We show that quantum-to-classical channels, i.e., quantum measurements, can be asymptotically
simulated by an amount of classical communication equal to the quantum mutual
information of the measurement, if sufficient shared randomness is available. This result 
generalizes Winter's measurement compression theorem for fixed independent and identically 
distributed inputs [Winter, CMP 244 (157), 2004] to arbitrary inputs, and more importantly, 
it identifies the quantum mutual information of a measurement as the information gained 
by performing it, independent of the input state on which it is performed. Our result is a 
generalization of the classical reverse Shannon theorem to quantum-to-classical channels. In 
this sense, it can be seen as a quantum reverse Shannon theorem for quantum-to-classical 
channels, but with the entanglement assistance and quantum communication replaced by 
shared randomness and classical communication, respectively. The proof is based on a novel 
one-shot state merging protocol for "classically coherent states" as well as the post-selection 
technique for quantum channels, and it uses techniques developed for the quantum reverse 
Shannon theorem [Berta et al, CMP 306 (579), 2011]. 



I. INTRODUCTION 

Measurement is an integral part of quantum theory. It is the means by which we gather information
about a quantum system. Although the classical notion of a measurement is rather
straightforward, the quantum notion of measurement has been the subject of much thought and
debate [1]. One interpretation is that the act of measurement on a quantum system causes it to
abruptly jump or "collapse" into one of several possible states with some probability, an evolution
seemingly different from the smooth, unitary transitions resulting from Schrödinger's wave
equation. Some have advocated for a measurement postulate in quantum theory [20], while others
have advocated that our understanding of quantum measurement should follow from other
postulates [62].

In spite of the aforementioned difficulties in understanding and interpreting quantum measurement,
there is a precise question that one can formulate concerning it:

How much information is gained by performing a given quantum measurement? 

This question has a rather long history, which to our knowledge begins with the work of Groenewold
[24]. In 1971, Groenewold argued on intuitive grounds for the following "entropy reduction"
to quantify the information gained by performing a quantum measurement:

\[ H(\rho) - \sum_x p_x H(\rho_x) , \tag{1} \]

where $\rho$ is the initial state before the measurement occurs, $\{p_x, \rho_x\}$ is the post-measurement
ensemble induced by the measurement, and $H(\sigma) = -\mathrm{tr}[\sigma \log \sigma]$ is the von Neumann entropy of a
state $\sigma$. The intuition behind this measure is that it quantifies the reduction in uncertainty after
performing a quantum measurement on a quantum system in state $\rho$, and its form is certainly
reminiscent of a Holevo-like quantity [26], although the classical data in the above Groenewold
quantity appears at the output of the process rather than at the input as in the case of the Holevo
quantity. Groenewold left open the question of whether this quantity is non-negative for all
measurements, and Lindblad proved that non-negativity holds whenever the measurement is of the von
Neumann-Lüders kind (projecting onto an eigenspace of an observable) [38]. Ozawa then settled
the matter by proving that the above quantity is non-negative if and only if the post-measurement
states are of the form

\[ \rho_x = \frac{M_x \rho M_x^\dagger}{\mathrm{tr}[M_x^\dagger M_x \rho]} , \tag{2} \]

for some operators $\{M_x\}$ such that $\sum_x M_x^\dagger M_x = 1$ [43]. Such measurements are termed "efficient",
and differ from general measurements as the latter may have several operators $M_{x,s}$ corresponding
to the result $x$ [23].
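Two elementary cases may help to make the sign behaviour of (1) concrete. The following worked example is an illustration added here under stated assumptions; the two measurements are chosen by us and are not taken from the cited works.

```latex
% Case 1 (efficient measurement): rank-one projective measurement M_x = |x><x|.
% Every post-measurement state \rho_x = |x><x| is pure, so H(\rho_x) = 0 and
\[
  H(\rho) - \sum_x p_x H(\rho_x) = H(\rho) \geq 0 .
\]

% Case 2 (not efficient): a single-outcome measurement that discards the input and
% prepares the maximally mixed state \pi_A = 1_A / |A|. This map needs several
% Kraus operators for its one outcome, so it is not of the form (2), and
\[
  H(\rho) - H(\pi_A) = H(\rho) - \log |A| \leq 0 ,
\]
% with strict inequality whenever \rho \neq \pi_A, e.g. for every pure input.
```

The second case shows how a measurement outside the class (2) can drive the entropy reduction below zero.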

The fact that the quantity in (1) can become negative for some quantum measurements excludes
it from being a generally appealing measure of information gain. To remedy this situation,
Buscemi et al. later advocated for the following measure to characterize the information gain of a
quantum measurement when acting upon a particular state $\rho$ [9, 39, 50, 57]:

\[ I(X:R)_\omega , \tag{3} \]

where $I(X:R)_\omega = H(X)_\omega + H(R)_\omega - H(XR)_\omega$ is the quantum mutual information of the following
state:

\[ \omega_{XR} = \sum_x |x\rangle\langle x|_X \otimes \mathrm{tr}_A\{ (\mathcal{M}_x \otimes \mathcal{I}_R)(|\rho\rangle\langle\rho|_{AR}) \} . \tag{4} \]

The register $X$ is a classical register containing the outcome of the measurement, $\mathcal{M} = \{\mathcal{M}_x\}$ is
a collection of completely positive, trace non-increasing maps characterizing the measurement (for
which the sum map $\sum_x \mathcal{M}_x$ is trace preserving), $\mathcal{I}$ is the identity map, and $|\rho\rangle_{AR}$ is a purification
of the initial state $\rho$ on system $A$ to a purifying system $R$. The advantages of the measure of
information gain in (3) are as follows:

• It is non-negative. 

• It reduces to Groenewold's quantity in (1) for the special case of measurements of the form 
in (2) [9]. 

• It characterizes the trade-off between information and disturbance in quantum measure- 
ments [9]. 

• It has an operational interpretation in Winter's measurement compression protocol as the 
optimal rate at which a measurement gathers information [60]. 

This last advantage is the most compelling one from the perspective of quantum information
theory: one cannot really justify a measure as an information measure unless it corresponds to a
meaningful information processing task. Indeed, when reading the first few paragraphs of
Groenewold's paper [24], it becomes evident that his original motivation was information theoretic in
nature, and with this in mind, Winter's measure in (3) is clearly the one Groenewold was seeking
after all.

In spite of the above arguments in favor of the information measure in (3) as a measure of 
information gain, it is still lacking in one aspect: it is dependent on the state on which the quantum 
measurement acts in addition to the measurement itself. A final requirement that one should 
impose for a measure of information gain by a measurement is that it should depend only on the 
measurement itself. A simple way to remedy this problem is to maximize the quantity in (3) over 
all possible input states, leading to the following characterization of information gain: 

\[ I(\mathcal{M}) = \max_{\rho_{AR}} I(X:R)_\omega , \tag{5} \]

for $\omega_{XR}$ as in (4). The quantity above has already been identified and studied by previous authors
as an important information quantity, being labeled as the "purification capacity" of a measurement
[33, 34] or the "information capacity of a quantum observable" [29]. The above quantity also admits
an operational interpretation as the entanglement-assisted capacity of a quantum measurement
for transmitting classical information [3, 28, 29], though it is our opinion that this particular
operational interpretation is not sufficiently compelling such that we should associate the measure
in (5) with the notion of information gain. The main aim of this paper is to address this issue by
providing a compelling operational interpretation of the measure in (5).

FIG. 1. Simulation (left) of the measurement $\mathcal{M}^{\otimes n}$ (right). In the simulation, Alice uses shared randomness
to perform a new measurement, whose result she communicates to Bob, such that Bob can recover the actual
measurement output $X^n$ using the message and the shared randomness. If the simulation scheme works for
any input, we can associate the amount of communication with the information gained by the measurement.
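As a small numerical illustration of (3)-(5), consider the special case in which every measurement operator is diagonal in one fixed basis and the input state is diagonal in the same basis. Under that assumption (ours, made only to keep the example elementary), the state in (4) is classical on both X and R, and I(X:R) reduces to the Shannon mutual information of the joint distribution r_a N(x|a), where r_a are the eigenvalues of the input and N(x|a) are the diagonal entries of the measurement operators. The function names below are hypothetical, and the grid search over diagonal inputs only gives a lower bound on the maximum in (5).

```python
import math

def mutual_information(joint):
    """Shannon mutual information (bits) of joint[a][x]."""
    pr = [sum(row) for row in joint]          # marginal on R
    px = [sum(col) for col in zip(*joint)]    # marginal on X
    total = 0.0
    for a, row in enumerate(joint):
        for x, p in enumerate(row):
            if p > 0:
                total += p * math.log2(p / (pr[a] * px[x]))
    return total

def info_gain_diagonal(r, N):
    """I(X:R) for a diagonal input with spectrum r and diagonal POVM entries N[a][x]."""
    joint = [[r[a] * N[a][x] for x in range(len(N[a]))] for a in range(len(r))]
    return mutual_information(joint)

# Assumed example: a noisy qubit measurement that reports the wrong outcome w.p. 0.1.
N = [[0.9, 0.1], [0.1, 0.9]]
gain_uniform = info_gain_diagonal([0.5, 0.5], N)   # maximally mixed input
gain_pure = info_gain_diagonal([1.0, 0.0], N)      # basis-state input
# Crude lower bound on (5): search over diagonal inputs only.
best = max(info_gain_diagonal([t / 100, 1 - t / 100], N) for t in range(101))
```

In this sketch the basis-state input yields zero gain because its purifying system is trivial, which illustrates why (3) depends on the input and why (5) maximizes over it.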



II. SUMMARY OF RESULTS 



In this paper, our main contribution is to show that $I(\mathcal{M})$ is the optimal rate at which a
measurement gains information when many identical instances of it act on an arbitrary input state.
In our opinion, this new result establishes (5) as the information-theoretic measure of information
gain of a quantum measurement. In more detail, let $A$ denote the input Hilbert space for a given
measurement $\mathcal{M}$. We suppose that a third party prepares an arbitrary quantum state on a Hilbert
space $A^{\otimes n}$, which is equivalent to $n$ identical copies of the original Hilbert space $A$, where $n$ is
a large positive number. A sender and receiver can then exploit some amount of shared random
bits and classical communication to simulate the action of $n$ instances of the measurement $\mathcal{M}$
(denoted by $\mathcal{M}^{\otimes n}$) on the chosen input state, in such a way that it becomes physically impossible
for the third party, to whom the receiver passes along the measurement outcomes, to distinguish
between the simulation and the ideal measurement $\mathcal{M}^{\otimes n}$ as $n$ becomes large (the third party can
even keep the purifying system of a purification of the chosen input state in order to help with
the distinguishing task). By design, the information gained by the measurement is that relayed by
the classical communication. Following [60], we call this task universal measurement compression.
We prove that the optimal rate of classical communication is equal to $I(\mathcal{M})$, if sufficient shared
randomness is available.

The information-theoretic task outlined above is also known as channel simulation (depicted in
Figure 1), and it has been well studied for the case of fully classical channels (with classical inputs
and classical outputs) [3, 13, 14] and fully quantum channels (with quantum inputs and quantum 
outputs) [2, 3, 7]. The "in-between" case of channels with quantum inputs and classical outputs 
(i.e., measurements) has been studied as well [60] (see also [57]), but as mentioned above, the 
problem of simulating many instances of a quantum measurement on an arbitrary input state has 
not been studied before this paper. Beyond its intrinsic interest as an information-processing task, 
channel simulation has two known concrete applications: in establishing a strong converse rate for 
a channel coding task [2-5] and in rate distortion coding (lossy data compression) [16-18, 59]. 

Our paper also features some related results of interest. We characterize the optimal rate region 
consisting of the rates of shared randomness and classical communication that are both necessary 
and sufficient for the existence of a measurement simulation, whenever both the sender and receiver 
are required to obtain the measurement outcomes (this is known as a feedback simulation since the 
sender also obtains the measurement outcomes). We also characterize the optimal rate region of 
shared randomness and classical communication for a non-feedback simulation, in which the sender 
is not required to obtain the measurement outcomes. Note that if sufficient shared randomness is 
available and we are only interested in quantifying the rate of classical communication, then there 
is no advantage of a non-feedback simulation over a feedback one — the optimal rate of classical 
communication is given by (5). 

Our proof technique in this paper exploits ideas from the approach in [7] for proving the fully 
quantum reverse Shannon theorem. In fact, one can think of our approach here as a "classicalized" 
or "dephased" version of that approach. In particular, we begin by establishing a protocol known as 
"classically coherent state merging," which is a variation of the well-known state merging protocol 
[30, 31] specialized to classically coherent states (see Section III for definition). We then show how 
time-reversing this protocol and exchanging the roles of Alice and Bob leads to a protocol known 
as "classically coherent state splitting." It suffices for our purposes for this protocol to use shared 
randomness and classical communication rather than entanglement and quantum communication, 
respectively. Generalizing this last protocol then leads to a one-shot state-and-channel simulation 
which is essentially optimal when acting on a single copy of a known state. Finally, we exploit the 
post-selection technique for quantum channels [10] and the aforementioned state splitting protocol 
to show that it suffices to simulate many instances of a measurement on a purification of a particular 
de Finetti quantum input state in order to guarantee that the simulation is asymptotically perfect 
when acting on an arbitrary quantum state. We then show that applying very similar reasoning
as above along with randomness recycling [2] solves the non-feedback case.

We organize this paper as follows. In Section III, we introduce our notation and review preliminary
concepts such as states, distance measures, channels, isometries, entropies, smooth entropies,
and classically coherent states. Section IV then introduces one-shot protocols for state merging
and state splitting of classically coherent states (the classical state splitting protocol turns out to
be the most important tool for proving our main result). Section V provides a proof of our main
results for the case of feedback and non-feedback simulations, and we briefly comment on possible
extensions and applications in Section VI. We finally conclude in Section VII by summarizing our
results and stating some directions for future research.



III. PRELIMINARIES

States, Distance Measures, Channels, Isometries. Let $A, B, C, \ldots$ denote finite dimensional
Hilbert spaces and let $|A|$ denote the dimension of $A$. We establish notation for several
sets: $\mathcal{L}(A)$ linear operators on $A$, $\mathcal{P}(A)$ non-negative linear operators on $A$,
$\mathcal{S}_\leq(A) = \{\rho_A \in \mathcal{P}(A) \mid \mathrm{tr}[\rho] \leq 1\}$ subnormalized states on $A$,
$\mathcal{S}(A) = \{\rho_A \in \mathcal{P}(A) \mid \mathrm{tr}[\rho] = 1\}$ density operators or states on $A$, and
$\mathcal{V}(A) = \{\rho_A \in \mathcal{S}(A) \mid \mathrm{tr}[\rho^2] = 1\}$ pure-state density operators on $A$. We
define the purified distance $P(\rho_A, \sigma_A) = \sqrt{1 - \bar{F}^2(\rho_A, \sigma_A)}$ for $\rho_A, \sigma_A \in \mathcal{S}_\leq(A)$, where
$\bar{F}(\rho_A, \sigma_A) = F(\rho_A, \sigma_A) + \sqrt{(1 - \mathrm{tr}[\rho_A])(1 - \mathrm{tr}[\sigma_A])}$, and the quantum fidelity
$F(\rho_A, \sigma_A) = \|\sqrt{\rho_A}\sqrt{\sigma_A}\|_1$ with $\|\Gamma\|_1 = \mathrm{tr}[\sqrt{\Gamma^\dagger \Gamma}]$ for $\Gamma \in \mathcal{L}(A)$. We use the notation
$\rho_A \approx_\varepsilon \sigma_A$ to indicate that $\rho_A$ and $\sigma_A$ are $\varepsilon$-close with respect to the purified distance:
$P(\rho_A, \sigma_A) \leq \varepsilon$. We define the $\varepsilon$-ball around $\rho_A$ as
$\mathcal{B}^\varepsilon(\rho_A) = \{\bar{\rho}_A \in \mathcal{S}_\leq(A) : \bar{\rho}_A \approx_\varepsilon \rho_A\}$. The tensor product of two Hilbert spaces $A$ and $B$ is
denoted by $AB = A \otimes B$. Given a multipartite operator $\rho_{AB} \in \mathcal{P}(AB)$, we unambiguously write
$\rho_A = \mathrm{tr}_B[\rho_{AB}]$ for the corresponding reduced operator. For $M_A \in \mathcal{L}(A)$, we write $M_A = M_A \otimes 1_B$
for the enlargement on any joint Hilbert space $AB$, where $1_B$ denotes the identity operator on $B$.
Isometries from $A$ to $B$ are denoted by $V_{A \to B}$. For Hilbert spaces $A$, $B$ with orthonormal
bases $\{|i\rangle_A\}_{i=1}^{|A|}$, $\{|i\rangle_B\}_{i=1}^{|B|}$ and $|A| = |B|$, the canonical identity mapping from $\mathcal{L}(A)$ to $\mathcal{L}(B)$
with respect to these bases is denoted by $\mathcal{I}_{A \to B}$, i.e., $\mathcal{I}_{A \to B}(|i\rangle\langle j|_A) = |i\rangle\langle j|_B$. A linear map
$\mathcal{E}_{A \to B} : \mathcal{L}(A) \to \mathcal{L}(B)$ is positive if $\mathcal{E}_{A \to B}(\rho_A) \in \mathcal{P}(B)$ for all $\rho_A \in \mathcal{P}(A)$. It is completely positive
if the map $(\mathcal{E}_{A \to B} \otimes \mathcal{I}_{C \to C})$ is positive for all $C$. Completely positive and trace preserving maps
are called quantum channels. The support of $\rho_A \in \mathcal{P}(A)$ is denoted by $\mathrm{supp}(\rho_A)$, the projector
onto $\mathrm{supp}(\rho_A)$ is denoted by $\rho_A^0$ and $\mathrm{tr}[\rho_A^0] = \mathrm{rank}(\rho_A)$, the rank of $\rho_A$. For $\rho_A \in \mathcal{P}(A)$ we write
$\|\rho_A\|_\infty$ for the operator norm of $\rho_A$, which is equal to the maximum eigenvalue of $\rho_A$.

Diamond Norm. We will need a distance measure for quantum channels. We use a norm
on the set of quantum channels which measures the bias in distinguishing two such mappings. In 
quantum information theory, this norm is known as the diamond norm [37]. Here, we present it in a 
formulation which highlights that it is dual to the well-known completely bounded (cb) norm [44]. 

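Definition 1. Let $\mathcal{E}_A : \mathcal{L}(A) \to \mathcal{L}(B)$ be a linear map. The diamond norm of $\mathcal{E}_A$ is defined as

\[ \|\mathcal{E}_A\|_\diamond = \sup_{k \in \mathbb{N}} \|\mathcal{E}_A \otimes \mathcal{I}_k\|_1 , \tag{6} \]

where $\|\mathcal{F}_A\|_1 = \sup_{\sigma \in \mathcal{S}_\leq(A)} \|\mathcal{F}_A(\sigma)\|_1$ and $\mathcal{I}_k$ denotes the identity map on states of a $k$-dimensional
quantum system.

The supremum in Definition 1 is reached for $k = |A|$ [37, 44]. Two quantum channels $\mathcal{E}$ and $\mathcal{F}$
are called $\varepsilon$-close if they are $\varepsilon$-close in the metric induced by the diamond norm.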

Classically Coherent States. We say that a pure state $|\psi\rangle\langle\psi|_{X_A X_B R} \in \mathcal{V}(X_A X_B R)$ is
classically coherent with respect to systems $X_A X_B$ if there is an orthonormal basis $\{|x\rangle\}$ such that
$|\psi\rangle_{X_A X_B R}$ can be written in the following form:

\[ |\psi\rangle_{X_A X_B R} = \sum_x \sqrt{p_x}\, |xx\rangle_{X_A X_B} \otimes |\psi^x\rangle_R \tag{7} \]

for some probability distribution $p_x$ and states $|\psi^x\rangle_R$. Harrow realized the importance of classically
coherent states for quantum communication tasks [25], while Refs. [22, 51] recently exploited
this notion in devising a "decoupling approach" to the Holevo-Schumacher-Westmoreland coding
theorem [27, 49] that is useful for our purposes here. Classically coherent states are also related to
Zurek's approach to decoherence [61], in which classicality arises from an inaccessible environment
possessing an "imprint" of a classical state in superposition (as in the above state if we think of
$X_B$ as an environment).

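Entropies. Recall the following standard definitions. The von Neumann entropy of $\rho_A \in \mathcal{S}(A)$
is defined as^1

\[ H(A)_\rho = -\mathrm{tr}[\rho_A \log \rho_A] . \tag{8} \]

The quantum relative entropy of $\rho_A \in \mathcal{S}_\leq(A)$ with respect to $\sigma_A \in \mathcal{P}(A)$ is given by

\[ D(\rho_A \| \sigma_A) = \mathrm{tr}[\rho_A \log \rho_A] - \mathrm{tr}[\rho_A \log \sigma_A] , \tag{9} \]

if $\mathrm{supp}(\rho_A) \subseteq \mathrm{supp}(\sigma_A)$ and $\infty$ otherwise. The conditional von Neumann entropy of $A$ given $B$ for
$\rho_{AB} \in \mathcal{S}(AB)$ is defined as

\[ H(A|B)_\rho = -D(\rho_{AB} \| 1_A \otimes \rho_B) . \tag{10} \]

The mutual information between $A$ and $B$ for $\rho_{AB} \in \mathcal{S}(AB)$ is given by

\[ I(A:B)_\rho = D(\rho_{AB} \| \rho_A \otimes \rho_B) . \tag{11} \]

Note that we can also write

\[ H(A|B)_\rho = -\inf_{\sigma_B \in \mathcal{S}(B)} D(\rho_{AB} \| 1_A \otimes \sigma_B) , \tag{12} \]

\[ I(A:B)_\rho = \inf_{\sigma_B \in \mathcal{S}(B)} D(\rho_{AB} \| \rho_A \otimes \sigma_B) . \tag{13} \]

^1 All logarithms in this paper are taken to base 2.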

Smooth Entropies. We now give the definitions of the smooth entropy measures that we need
in this work. We define the max-relative entropy of $\rho_A \in \mathcal{S}_\leq(A)$ with respect to $\sigma_A \in \mathcal{P}(A)$ as [15]

\[ D_{\max}(\rho_A \| \sigma_A) = \inf\{\lambda \in \mathbb{R} : 2^\lambda \cdot \sigma_A \geq \rho_A\} . \tag{14} \]

The conditional min-entropy of $A$ given $B$ for $\rho_{AB} \in \mathcal{S}_\leq(AB)$ is defined as

\[ H_{\min}(A|B)_\rho = -\inf_{\sigma_B \in \mathcal{S}(B)} D_{\max}(\rho_{AB} \| 1_A \otimes \sigma_B) . \tag{15} \]

In the special case where $B$ is trivial, we get $H_{\min}(A)_\rho = -\log \|\rho_A\|_\infty$. The max-information that
$B$ has about $A$ for $\rho_{AB} \in \mathcal{S}_\leq(AB)$ is defined as [7]

\[ I_{\max}(A:B)_\rho = \inf_{\sigma_B \in \mathcal{S}(B)} D_{\max}(\rho_{AB} \| \rho_A \otimes \sigma_B) . \tag{16} \]

Note that, unlike the mutual information, the max-information is not symmetric in its arguments.^2
Smooth entropy measures are defined by extremizing the non-smooth measures over a set of
nearby states, where our notion of "nearby" is expressed in terms of the purified distance. The
smooth max-information that $B$ has about $A$ for $\rho_{AB} \in \mathcal{S}_\leq(AB)$ is defined as

\[ I^\varepsilon_{\max}(A:B)_\rho = \inf_{\bar{\rho}_{AB} \in \mathcal{B}^\varepsilon(\rho_{AB})} I_{\max}(A:B)_{\bar{\rho}} . \tag{17} \]

In contrast to the non-smooth case, the smooth max-information is approximately symmetric in
its arguments.

^2 For a further discussion of max-based measures for mutual information, see [11].

Lemma 1. [11, Corollary 4-^-4] Let $\varepsilon > 0$, $\varepsilon' > 0$, and $\rho_{AB} \in \mathcal{S}(AB)$. Then, we have that

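\[ I^{\varepsilon + 2\sqrt{\varepsilon} + \varepsilon'}_{\max}(B:A)_\rho \leq I^{\varepsilon'}_{\max}(A:B)_\rho + \log\left(\frac{2}{\varepsilon^2} + 2\right) , \tag{18} \]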

and the same holds for A and B interchanged. 

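For technical reasons, we also need the following entropic quantities. For $\varepsilon > 0$, and $\rho_A \in \mathcal{S}_\leq(A)$,
the max-entropy and its smooth version are defined as

\[ H_{\max}(A)_\rho = 2 \log \mathrm{tr}[\rho_A^{1/2}] , \tag{19} \]

\[ H^\varepsilon_{\max}(A)_\rho = \inf_{\bar{\rho}_A \in \mathcal{B}^\varepsilon(\rho_A)} H_{\max}(A)_{\bar{\rho}} . \tag{20} \]

Furthermore, the zero-Renyi entropy and its smooth version are defined as

\[ H_0(A)_\rho = \log \mathrm{rank}(\rho_A) , \tag{21} \]

\[ H^\varepsilon_0(A)_\rho = \inf_{\bar{\rho}_A \in \mathcal{B}^\varepsilon(\rho_A)} H_0(A)_{\bar{\rho}} . \tag{22} \]

Since all Hilbert spaces in this paper are assumed to be finite dimensional and the $\varepsilon$-ball is
convex and compact [52], we can replace the infima by minima and the suprema by maxima in all
the definitions of this section. We will do so in what follows.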
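For a state that is diagonal in a known basis, the non-smooth quantities above depend only on its eigenvalues, so they can be evaluated directly from a probability vector. The sketch below does this for the von Neumann entropy (8), the min-entropy with trivial conditioning system, the max-entropy (19), and the zero-Renyi entropy (21). The function name and the example distribution are our own choices for illustration.

```python
import math

def entropies(p):
    """Entropies (in bits) of a diagonal state with eigenvalues p."""
    support = [x for x in p if x > 0]
    shannon = -sum(x * math.log2(x) for x in support)           # H(A), eq. (8)
    h_min = -math.log2(max(support))                             # -log ||rho||_inf
    h_max = 2 * math.log2(sum(math.sqrt(x) for x in support))    # eq. (19)
    h_0 = math.log2(len(support))                                # eq. (21)
    return {"H": shannon, "H_min": h_min, "H_max": h_max, "H_0": h_0}

vals = entropies([0.5, 0.25, 0.125, 0.125])
```

For this distribution the four values are ordered as H_min ≤ H ≤ H_max ≤ H_0, which is the general ordering of these Renyi-type quantities.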



IV. CLASSICALLY COHERENT STATE MERGING AND STATE SPLITTING 

We first establish "one-shot" protocols for state merging and state splitting of classically coher- 
ent quantum states. The classical state splitting protocol established in this section will then be 
the basis for the universal measurement compression protocol discussed in the next section. 

Definition 2 (State Merging for Classically Coherent States). Consider a bipartite system with
parties Alice and Bob. Let $\varepsilon > 0$, and $\rho_{X_A X_B B R} \in \mathcal{V}(X_A X_B B R)$ be classically coherent on $X_A X_B$
with respect to the basis $\{|x\rangle\}$, where Alice controls $X_A$, Bob $X_B B$, and $R$ is a reference system.
A quantum protocol $\mathcal{E}$ is called an $\varepsilon$-error state merging of $\rho_{X_A X_B B R}$ if it consists of applying local
operations at Alice's side, sending $q$ qubits from Alice to Bob, local operations at Bob's side, and
it outputs a state $\omega_{X_{B'} X_B B R X_{A_1} B_1} = (\mathcal{E} \otimes \mathcal{I}_R)(\rho_{X_A X_B B R})$ such that

\[ \omega_{X_{B'} X_B B R X_{A_1} B_1} \approx_\varepsilon \mathcal{I}_{X_A \to X_{B'}}(\rho_{X_A X_B B R}) \otimes \Phi_{X_{A_1} B_1} , \tag{23} \]

where $\Phi_{X_{A_1} B_1}$ is a maximally entangled state of Schmidt rank $E$. The quantity $q$ is called the
quantum communication cost, and $e = \lfloor \log E \rfloor$ the entanglement gain.

FIG. 2. The protocol from the proof of Lemma 2 for state merging of a classically coherent state on systems
$R X_A X_B B$. The operation $P$ is a permutation of states in the orthonormal basis of $X_A$, and it also
splits $X_A$ into two subsystems. The operation $V$ is an isometry guaranteed by Uhlmann's theorem to
complete the merging task, while also generating entanglement between Alice and Bob.

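Lemma 2. Let $\varepsilon > 0$, and $\rho_{X_A X_B B R} \in \mathcal{V}(X_A X_B B R)$ be classically coherent on $X_A X_B$ with respect
to the basis $\{|x\rangle\}$. Then there exists an $\varepsilon$-error state merging protocol for $\rho_{X_A X_B B R}$ with quantum
communication cost

\[ \left\lceil H_0(X_A)_\rho - H_{\min}(X_A|R)_\rho + 4 \cdot \log\frac{1}{\varepsilon} \right\rceil \tag{24} \]

and entanglement gain

\[ \left\lfloor H_{\min}(X_A|R)_\rho - 4 \cdot \log\frac{1}{\varepsilon} \right\rfloor . \tag{25} \]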
Proof. The intuition is as follows. First Alice applies a particular permutation $P_{X_A \to X_{A_1} X_{A_2}}$ in
the basis $\{|x\rangle\}$; it also splits the output into two subsystems $X_{A_1}$ and $X_{A_2}$. Then she sends
$X_{A_2}$ to Bob, who finally performs a local isometry $V_{X_{A_2} X_B B \to X_{B'} X_B B B_1}$. After Alice applies the
permutation, the state on $X_{A_1} R$ is approximately given by $\frac{1_{X_{A_1}}}{|X_{A_1}|} \otimes \rho_R$ and Bob holds a purification
of this. But $\frac{1_{X_{A_1}}}{|X_{A_1}|} \otimes \rho_R$ is the reduced state of $\rho_{X_{B'} X_B B R} \otimes \Phi_{X_{A_1} B_1}$, and since all purifications are
equivalent up to local isometries, there exists an isometry $V_{X_{A_2} X_B B \to X_{B'} X_B B B_1}$ on Bob's side that
transforms the state into $\rho_{X_{B'} X_B B R} \otimes \Phi_{X_{A_1} B_1}$. Figure 2 depicts this protocol.

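More formally, let $X_A = X_{A_1} X_{A_2}$ with $\log |X_{A_2}| = \lceil \log |X_A| - H_{\min}(X_A|R)_\rho + 4 \cdot \log\frac{1}{\varepsilon} \rceil$.
According to Proposition 27 concerning permutation based extractors, there exists a permutation
$P_{X_A \to X_{A_1} X_{A_2}}$ such that for $\sigma_{X_{A_1} X_{A_2} X_B B R} = P_{X_A \to X_{A_1} X_{A_2}}(\rho_{X_A X_B B R})$,

\[ \left\| \sigma_{X_{A_1} R} - \frac{1_{X_{A_1}}}{|X_{A_1}|} \otimes \rho_R \right\|_1 \leq \varepsilon^2 . \tag{26} \]

By an upper bound of the purified distance in terms of the trace distance (Lemma 24), this implies
$\sigma_{X_{A_1} R} \approx_\varepsilon \frac{1_{X_{A_1}}}{|X_{A_1}|} \otimes \rho_R$. Alice applies this permutation $P_{X_A \to X_{A_1} X_{A_2}}$ and then sends $X_{A_2}$ to Bob;
therefore

\[ q = \left\lceil \log |X_A| - H_{\min}(X_A|R)_\rho + 4 \cdot \log\frac{1}{\varepsilon} \right\rceil . \tag{27} \]

Uhlmann's theorem [36, 55] guarantees that there exists an isometry $V_{X_{A_2} X_B B \to X_{B'} X_B B B_1}$ such
that

\[ P\left( \sigma_{X_{A_1} R} , \frac{1_{X_{A_1}}}{|X_{A_1}|} \otimes \rho_R \right) = P\left( V_{X_{A_2} X_B B \to X_{B'} X_B B B_1}(\sigma_{X_{A_1} X_{A_2} X_B B R}) , \Phi_{X_{A_1} B_1} \otimes \rho_{X_{B'} X_B B R} \right) . \tag{28} \]

Hence the entanglement gain is given by

\[ e = \left\lfloor H_{\min}(X_A|R)_\rho - 4 \cdot \log\frac{1}{\varepsilon} \right\rfloor . \tag{29} \]

Now if $\rho_{X_A}$ has full rank, this is already what we want. In general $\log \mathrm{tr}[\rho^0_{X_A}] \leq \log |X_A|$.
But in this case we can restrict $X_A$ to the subspace on which $\rho_{X_A}$ has full rank, i.e.
those $x$ for which $p_x > 0$. □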

Definition 3 (State Splitting for Classically Coherent States). Consider a bipartite scenario with
parties Alice and Bob. Let $\varepsilon > 0$, and $\rho_{A X_A X_{A'} R} \in \mathcal{V}(A X_A X_{A'} R)$ be classically coherent on
$X_A X_{A'}$ with respect to the basis $\{|x\rangle\}$, where Alice controls $A X_A X_{A'}$, and $R$ is a reference system.
Furthermore let $\Phi_{A_1 B_1}$ be a maximally entangled state of Schmidt rank $E$ shared between Alice and
Bob. A quantum protocol $\mathcal{E}$ is called an $\varepsilon$-error state splitting of $\rho_{A X_A X_{A'} R}$ if it consists of applying
local operations at Alice's side, sending $q$ qubits from Alice to Bob, local operations at Bob's side,
and it outputs a state $\omega_{A X_A X_B R} = (\mathcal{E} \otimes \mathcal{I}_R)(\rho_{A X_A X_{A'} R} \otimes \Phi_{A_1 B_1})$ such that

\[ \omega_{A X_A X_B R} \approx_\varepsilon \mathcal{I}_{X_{A'} \to X_B}(\rho_{A X_A X_{A'} R}) . \tag{30} \]

The quantity $q$ is called the quantum communication cost, and $e = \lceil \log E \rceil$ the entanglement cost.

Lemma 3. Let $\varepsilon > 0$, and $\rho_{A X_A X_{A'} R} \in \mathcal{V}(A X_A X_{A'} R)$ be classically coherent on $X_A X_{A'}$ with
respect to the basis $\{|x\rangle\}$. Then there exists an $\varepsilon$-error state splitting protocol for $\rho_{A X_A X_{A'} R}$ with
quantum communication cost

\[ \left\lceil H_0(X_{A'})_\rho - H_{\min}(X_{A'}|R)_\rho + 4 \cdot \log\frac{1}{\varepsilon} \right\rceil \tag{31} \]

and entanglement cost

\[ \left\lfloor H_{\min}(X_{A'}|R)_\rho - 4 \cdot \log\frac{1}{\varepsilon} \right\rfloor . \tag{32} \]

FIG. 3. (a) A simple protocol for state splitting obtained by time-reversing the state merging protocol
of Lemma 2 and interchanging the roles of Alice and Bob. (b) If it is not necessary to maintain the
quantum coherence of the $X$ systems (if they can be dephased to classical registers), then the state splitting
protocol can exploit shared randomness and classical communication instead of entanglement and quantum
communication, respectively.



Proof. We get the desired state splitting protocol by time-reversing the state merging protocol of
Lemma 2 and interchanging the roles of Alice and Bob. Figure 3(a) depicts the state splitting protocol
for classically coherent states. More precisely, we first define an isometry $V_{X_{A'_2} X_A A \to X_{A'} X_A A A_1}$,
analogously to $V_{X_{A_2} X_B B \to X_{B'} X_B B B_1}$ of (28) in the state merging protocol. Because all isometries
are injective, we can define an inverse of $V$ acting on the image of $V$ (which we denote by $\mathrm{Im}(V)$).
The inverse is again an isometry and we denote it by $V^{-1}_{\mathrm{Im}(V) \to X_{A'_2} X_A A}$. The protocol starts by
measuring the $A X_A X_{A'} A_1$ systems to decide whether $\rho_{A X_A X_{A'}} \otimes \Phi_{A_1} \in \mathrm{Im}(V)$ or not. If so, the
protocol proceeds by applying the isometry $V^{-1}_{\mathrm{Im}(V) \to X_{A'_2} X_A A}$; otherwise the state is discarded
and replaced with $|0\rangle\langle 0|_{X_{A'_2} X_A A}$. This step is necessary because the output of merging is not exactly
$\rho_{A X_A X_{A'} R}$. The next step is to send $X_{A'_2}$ to Bob, who then applies the permutation $P^{-1}_{X_{A'_2} B_1 \to X_B}$
defined analogously to $P_{X_A \to X_{A_1} X_{A_2}}$ in (26). By the monotonicity of the purified distance, we get
a state that is $\varepsilon$-close to $\mathcal{I}_{X_{A'} \to X_B}(\rho_{A X_A X_{A'} R})$. □

If we are not concerned with the coherence of the registers $X_A$ and $X_B$ shared between Alice
and Bob, then the protocol given above (Lemma 3) also works if the entanglement assistance and
the quantum communication are replaced by the same amount of shared randomness assistance
and classical communication, respectively. More precisely, we define:

Definition 4 (Classical State Splitting of Classically Coherent States). Consider a bipartite system
with parties Alice and Bob. Let $\varepsilon > 0$, and $\rho_{A X_A X_{A'} R} \in \mathcal{V}(A X_A X_{A'} R)$ be classically coherent on
$X_A X_{A'}$ with respect to the basis $\{|x\rangle\}$, where Alice controls $A X_A X_{A'}$, and $R$ is a reference system.
Furthermore let $\overline{\Phi}_{X_{A_1} X_{B_1}}$ denote $S$ bits of shared randomness shared between Alice and Bob. A
quantum protocol $\mathcal{E}$ is called an $\varepsilon$-error classical state splitting of $\rho_{A X_A X_{A'} R}$ if it consists of applying
local operations at Alice's side, sending $c$ bits from Alice to Bob, local operations at Bob's side, and
it outputs a state $\omega_{A X_A X_B R} = (\mathcal{E} \otimes \mathcal{I}_R)(\rho_{A X_A X_{A'} R} \otimes \overline{\Phi}_{X_{A_1} X_{B_1}})$ such that

\[ \omega_{A X_A X_B R} \approx_\varepsilon \sum_x \langle x|\rho_{A X_A X_{A'} R}|x\rangle_{X_{A'}} \otimes |x\rangle\langle x|_{X_B} . \tag{33} \]

The quantity $c$ is called the classical communication cost, and $s = \lceil \log S \rceil$ the shared randomness cost.

Using the achievability of state splitting of classically coherent states (Lemma 3) we get the
following.

Corollary 4. Let $\varepsilon > 0$, and $\rho_{A X_A X_{A'} R} \in \mathcal{V}(A X_A X_{A'} R)$ be classically coherent on $X_A X_{A'}$ with
respect to the basis $\{|x\rangle\}$. Then there exists a classical $\varepsilon$-error state splitting protocol for $\rho_{A X_A X_{A'} R}$
with classical communication cost

\[ \left\lceil H_0(X_{A'})_\rho - H_{\min}(X_{A'}|R)_\rho + 4 \cdot \log\frac{1}{\varepsilon} \right\rceil \tag{34} \]

and shared randomness cost

\[ \left\lfloor H_{\min}(X_{A'}|R)_\rho - 4 \cdot \log\frac{1}{\varepsilon} \right\rfloor . \tag{35} \]

Proof. Note that it is sufficient to find a protocol for state splitting of classically coherent states
(as in Definition 3) that only works up to random phase flips on the $X_B$ register. These random
phase flips then commute with the action of the permutation that takes systems $B_1$ and $X_{A'_2}$ to
$X_B$. Thus, if we use the protocol for state splitting of classically coherent states described before
(Lemma 3), random phase flips on $X_B$ are the same as random phase flips on $X_{A'_2} B_1$ before the
permutation $P^{-1}_{X_{A'_2} B_1 \to X_B}$ is applied. Since random phase flips on $B_1$ just transform the maximally
entangled state $\Phi_{A_1 B_1}$ to shared randomness $\overline{\Phi}_{X_{A_1} X_{B_1}}$ of the same size (with the relabeling of
$A_1 B_1$ to $X_{A_1} X_{B_1}$), and they dephase the quantum system $X_{A'_2}$ to a classical system, the protocol
of Lemma 3 also works for classical state splitting of classically coherent states. □
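To get a feeling for the size of the two costs in Corollary 4, the following sketch evaluates them in the simplest setting we could think of: a trivial (or uncorrelated) reference system R, so that the conditional min-entropy reduces to the min-entropy of the distribution p_x on X_{A'}. That restriction, the reading of the two cost expressions as a rounded-up and a rounded-down quantity, and the function name are assumptions made for this illustration.

```python
import math

def classical_splitting_costs(p, eps):
    """Costs (c, s) in bits for eigenvalues p of X_{A'} and error eps.

    Assumes a trivial reference R, so H_min(X_{A'}|R) = -log2(max p).
    Assumes c is rounded up and s is rounded down.
    """
    support = [px for px in p if px > 0]
    h0 = math.log2(len(support))           # zero-Renyi entropy, eq. (21)
    h_min = -math.log2(max(support))       # min-entropy with trivial R
    overhead = 4 * math.log2(1 / eps)      # the 4 * log(1/eps) term
    c = math.ceil(h0 - h_min + overhead)   # classical communication cost
    s = math.floor(h_min - overhead)       # shared randomness cost
    return c, s

# Uniform distribution on 2^10 outcomes with eps = 1/4.
c, s = classical_splitting_costs([2 ** -10] * (2 ** 10), 0.25)
```

In this example the communication consists entirely of the 4 log(1/ε) overhead, while the sum c + s equals H_0. For strongly peaked distributions the expression for s can become negative in this sketch, which simply signals that the overhead term dominates.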

Note that the above idea is similar to how Hsieh et al. recovered the Holevo-Schumacher- 
Westmoreland coding theorem for classical communication from a protocol for entanglement- 
assisted classical communication [32], simply by dephasing shared entanglement to common ran- 
domness and replacing random unitaries with random permutations. 






However, the classical communication cost of this protocol is not yet optimal (for the general 
one-shot case considered here). To improve this, we use an idea from a recent proof of the quantum 
reverse Shannon theorem, and Theorem 6 demonstrates that the rate found in terms of the smooth 
max-information is essentially optimal. The following lemma is the crucial ingredient for the proof 
of our main result: universal measurement compression (Theorem 7). 

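Theorem 5. Let $\varepsilon > 0$, $\varepsilon' > 0$, and $\rho_{A X_A X_{A'} R} \in \mathcal{V}(A X_A X_{A'} R)$ be classically coherent on $X_A X_{A'}$
with respect to the basis $\{|x\rangle\}$. Then there exists a classical $(\varepsilon + \varepsilon' + \sqrt{8\varepsilon'} + |X_{A'}|^{-1/2})$-error state
splitting protocol for $\rho_{A X_A X_{A'} R}$ with

\[ c \leq I^{\varepsilon'}_{\max}(X_{A'} : R)_\rho + 4 \cdot \log\frac{1}{\varepsilon} + 4 + \log\log |X_{A'}| \tag{36} \]

\[ c + s \leq H_0(X_{A'})_\rho + 2 + \log\log |X_{A'}| , \tag{37} \]

where $c$ denotes the classical communication cost, and $s$ the shared randomness cost.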

Proof. The idea for the protocol is as follows. Let $\rho_{A X_A X_{A'} R} = |\rho\rangle\langle\rho|_{A X_A X_{A'} R}$ with

\[ |\rho\rangle_{A X_A X_{A'} R} = \sum_x \sqrt{p_x} \cdot |xx\rangle_{X_A X_{A'}} \otimes |\rho^x\rangle_{AR} . \tag{38} \]

First, in our proof, we disregard all the $x$ with $p_x \leq |X_{A'}|^{-2}$. This introduces an error of at most
$|X_{A'}|^{-1/2}$, but the error at the end of the protocol is still upper bounded by the same quantity due to
the monotonicity of the purified distance. As the next step, we let Alice perform a measurement
$W_{X_{A'} \to X_{A'} Y_A}$ with roughly $2 \cdot \log |X_{A'}|$ measurement outcomes in the basis $\{|x\rangle\}_{x \in X_{A'}}$. That is,
the state after the measurement is of the form

\[ \omega_{A X_A X_{A'} R Y_A} = \sum_y q_y \, \rho^y_{A X_A X_{A'} R} \otimes |y\rangle\langle y|_{Y_A} , \tag{39} \]

where the index $y$ indicates which measurement outcome occurs, $q_y$ denotes its probability, and
$\rho^y_{A X_A X_{A'} R}$ is the corresponding post-measurement state. Then conditioned on the index $y$, we use
the classical state splitting protocol for classically coherent states from Corollary 4 for each state
$\rho^y_{A X_A X_{A'} R}$, and denote the corresponding classical communication cost and shared randomness cost
by $c_y$ and $s_y$, respectively. The total amount of classical communication we need for this is no larger
than $\max_y c_y$, plus the amount needed to send the register $Y_A$ (which is of order $\log\log |X_{A'}|$). The
sum cost is no larger than $\max_y (c_y + s_y)$ (along with the amount for sending $Y_A$). This completes
the description of the classical state splitting protocol for $\rho_{A X_A X_{A'} R}$. All that remains to do is to
bring the expression for the classical communication cost and the sum cost into the right form. In
the following, we describe the proof in detail.
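Let $Q = \lceil 2 \cdot \log|X_{A'}| - 1 \rceil$, $Y = \{0, 1, \ldots, Q, (Q+1)\}$ and let $\{P^y_{X_{A'}}\}_{y \in Y}$ be a collection of projectors on $X_{A'}$ defined as

$$P^{Q+1}_{X_{A'}} = \sum_{x :\, 0 < p_x \le 2^{-2\log|X_{A'}|}} |x\rangle\langle x|_{X_{A'}} , \qquad P^{Q}_{X_{A'}} = \sum_{x :\, 2^{-2\log|X_{A'}|} < p_x \le 2^{-Q}} |x\rangle\langle x|_{X_{A'}} , \tag{40}$$

and for $y = 0, 1, \ldots, (Q-1)$ as

$$P^{y}_{X_{A'}} = \sum_{x :\, 2^{-(y+1)} < p_x \le 2^{-y}} |x\rangle\langle x|_{X_{A'}} . \tag{41}$$

These define a measurement

$$W_{X_{A'} \to X_{A'} Y_A}(\cdot) = \sum_{y=0}^{Q+1} P^y_{X_{A'}} (\cdot) P^y_{X_{A'}} \otimes |y\rangle\langle y|_{Y_A} , \tag{42}$$

where the vectors $|y\rangle_{Y_A}$ form an orthonormal basis, and $Y_A$ is at Alice's side. Furthermore let

$$q_y = \mathrm{tr}\left[ P^y_{X_{A'}} \rho_{X_{A'}} \right] , \tag{43}$$
$$\rho^y_{A X_A X_{A'} R} = q_y^{-1} \cdot P^y_{X_{A'}} \rho_{A X_A X_{A'} R} P^y_{X_{A'}} , \tag{44}$$

and define the sub-normalized state

$$\bar\rho_{A X_A X_{A'} R} = \sum_{y=0}^{Q} q_y \cdot \rho^y_{A X_A X_{A'} R} . \tag{45}$$

We have

$$P(\bar\rho_{A X_A X_{A'} R}, \rho_{A X_A X_{A'} R}) = \sqrt{1 - F^2(\bar\rho_{A X_A X_{A'} R}, \rho_{A X_A X_{A'} R})} \tag{46}$$
$$\le \sqrt{1 - \sum_{y=0}^{Q} q_y} \le \sqrt{|X_{A'}| \cdot 2^{-2\log|X_{A'}|}} = |X_{A'}|^{-1/2} . \tag{47}$$

We proceed by defining the operations that we need for the classical state splitting protocol for $\bar\rho_{A X_A X_{A'} R}$. We want to use the $\varepsilon$-error classical state splitting protocol from Corollary 4 for each $\rho^y_{A X_A X_{A'} R}$, $y = 0, 1, \ldots, Q$. For each such $y$, this protocol has a classical communication cost

$$c_y \le H_0(X_{A'})_{\rho^y} - H_{\min}(X_{A'}|R)_{\rho^y} + 4 \cdot \log\frac{1}{\varepsilon} + 1 , \tag{48}$$

and sum cost

$$c_y + s_y \le H_0(X_{A'})_{\rho^y} , \tag{49}$$

where $s_y$ denotes the shared randomness cost.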
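
The binning behind the projectors in (40) and (41) can be illustrated with a small classical sketch. It is not part of the protocol itself: it only groups the outcomes $x$ by the dyadic interval that contains $p_x$, and lets one check the tail bound used in (47) and the rank bound used later in the proof. The function names `bin_outcomes` and `bin_weights` are hypothetical, logarithms are taken to base 2, and the alphabet size is assumed to be at least 2.

```python
import math


def bin_outcomes(p):
    """Group outcomes x by the dyadic interval containing p[x], as in (40)-(41).

    Assumes len(p) >= 2 and that p is a probability vector.
    Returns Q and a dict mapping each label y in {0, ..., Q+1} to a list of x.
    """
    d = len(p)                                # plays the role of |X_{A'}|
    log_d = math.log2(d)
    Q = math.ceil(2 * log_d - 1)
    cutoff = 2.0 ** (-2 * log_d)              # equals |X_{A'}|^{-2}
    bins = {y: [] for y in range(Q + 2)}
    for x, px in enumerate(p):
        if px <= cutoff:
            bins[Q + 1].append(x)             # discarded tail, label Q+1
        elif px <= 2.0 ** (-Q):
            bins[Q].append(x)                 # cutoff < p_x <= 2^{-Q}
        else:
            # 2^{-(y+1)} < p_x <= 2^{-y}  is equivalent to  y = floor(-log2 p_x)
            bins[math.floor(-math.log2(px))].append(x)
    return Q, bins


def bin_weights(p, bins):
    """Probability q_y of each label y, as in (43)."""
    return {y: sum(p[x] for x in xs) for y, xs in bins.items()}
```

For the distribution `p = [0.4, 0.3, 0.2, 0.05, 0.03, 0.01, 0.006, 0.004]` one has $Q = 5$, the three smallest probabilities fall into the discarded label $Q+1$, and their total weight stays below $1/|X_{A'}|$, in line with (47).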
[Figure 4: protocol diagram, not reproduced here; its wires are labelled Reference (R), Bob, and Alice.]

FIG. 4. Our final one-shot protocol for state splitting that achieves the smooth max-information rate of Theorem 5. The converse theorem in Theorem 6 guarantees that this one-shot protocol is essentially optimal in terms of its classical communication cost.

For $X_{A_1}$ on Alice's side, $X_{B_1}$ on Bob's side, and $X^y_{A_1}$, $X^y_{B_1}$ $2^{s_y}$-dimensional subspaces of $X_{A_1}$, $X_{B_1}$ respectively, the classical state splitting protocol from Corollary 4 has basically the following form: apply some isometry $V^y$ on Alice's side, send the resulting communication register from Alice to Bob (relabelling it as a register on Bob's side), and then apply some isometry $U^y$ on Bob's side ($U^y$ is the inverse permutation discussed in the proof of Corollary 4). As the next ingredient, we define the operations that supply the shared randomness of size $s_y$. For $y = 0, 1, \ldots, Q$, let $S^y_{X^y_{A_1}}$ and $S^y_{X^y_{B_1}}$ be the local operations at Alice's and Bob's side respectively, that put shared randomness of size $s_y$ on $X^y_{A_1} X^y_{B_1}$.



We are now ready to put the steps together and give the protocol for classical state splitting of $\bar\rho_{A X_A X_{A'} R}$ (depicted in Figure 4). Alice applies the measurement $W_{X_{A'} \to X_{A'} Y_A}$ from (42), followed by

$$S_{X_{A_1} Y_A} = \sum_{y=0}^{Q} S^y_{X^y_{A_1}} \otimes |y\rangle\langle y|_{Y_A} \tag{50}$$

and the isometry

$$V = \sum_{y=0}^{Q} V^y \otimes |y\rangle\langle y|_{Y_A} . \tag{51}$$

Afterwards she sends the communication register produced by $V$, together with $Y_A$, that is

$$c \le \max_y \left[ H_0(X_{A'})_{\rho^y} - H_{\min}(X_{A'}|R)_{\rho^y} \right] + 4 \cdot \log\frac{1}{\varepsilon} + 1 + \log\lceil 2 \cdot \log|X_{A'}| \rceil \tag{52}$$

bits to Bob (the transmitted registers are relabelled as registers on Bob's side, with $Y_A$ becoming $Y_B$). Then Bob applies

$$S_{X_{B_1} Y_B} = \sum_{y=0}^{Q} S^y_{X^y_{B_1}} \otimes |y\rangle\langle y|_{Y_B} , \tag{53}$$

followed by the isometry

$$U = \sum_{y=0}^{Q} U^y \otimes |y\rangle\langle y|_{Y_B} . \tag{54}$$

We obtain a sub-normalized state

$$\sigma_{A X_A X_B R Y_B} = \sum_{y=0}^{Q} q_y \cdot \sigma^y_{A X_A X_B R} \otimes |y\rangle\langle y|_{Y_B} , \tag{55}$$

with $\sigma^y_{A X_A X_B R} \approx_\varepsilon \mathcal{I}_{X_{A'} \to X_B}(\rho^y_{A X_A X_{A'} R})$ for $y = 0, 1, \ldots, Q$. By the (quasi) convexity of the purified distance in its arguments (Lemma 25), and the monotonicity of the purified distance under partial trace, we have

$$\sigma_{A X_A X_B R} \approx_\varepsilon \mathcal{I}_{X_{A'} \to X_B}(\bar\rho_{A X_A X_{A'} R}) . \tag{56}$$

Hence, we have shown the existence of an $\varepsilon$-error classical state splitting protocol for $\bar\rho_{A X_A X_{A'} R}$ with classical communication cost as in (52). But by the monotonicity of the purified distance, and the triangle inequality for the purified distance, this implies the existence of an $(\varepsilon + |X_{A'}|^{-1/2})$-error classical state splitting protocol for $\rho_{A X_A X_{A'} R}$, with the same classical communication cost as in (52).

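We now proceed by simplifying (52). We have $H_0(X_{A'})_{\rho^y} \le H_{\min}(X_{A'})_{\rho^y} + 1$ for $y = 0, 1, \ldots, Q$, as can be seen as follows:

$$2^{-(y+1)} \le \lambda_{\min}(q_y \cdot \rho^y_{X_{A'}}) \le \mathrm{rank}^{-1}(q_y \cdot \rho^y_{X_{A'}}) \le \| q_y \cdot \rho^y_{X_{A'}} \|_\infty \le 2^{-y} , \tag{57}$$

where $\lambda_{\min}(\rho^y_{X_{A'}})$ denotes the smallest non-zero eigenvalue of $\rho^y_{X_{A'}}$. Thus,

$$\mathrm{rank}(q_y \cdot \rho^y_{X_{A'}}) \le 2^{y+1} = 2^y \cdot 2 \le \| q_y \cdot \rho^y_{X_{A'}} \|_\infty^{-1} \cdot 2 , \tag{58}$$

and this is equivalent to the claim. Hence, we get an $(\varepsilon + |X_{A'}|^{-1/2})$-error classical state splitting protocol for $\rho_{A X_A X_{A'} R}$ with classical communication cost

$$c \le \max_y \left[ H_{\min}(X_{A'})_{\rho^y} - H_{\min}(X_{A'}|R)_{\rho^y} \right] + 4 \cdot \log\frac{1}{\varepsilon} + 2 + \log\lceil 2 \cdot \log|X_{A'}| \rceil \tag{59}$$
$$\le \max_y \left[ H_{\min}(X_{A'})_{\rho^y} - H_{\min}(X_{A'}|R)_{\rho^y} \right] + 4 \cdot \log\frac{1}{\varepsilon} + 4 + \log\log|X_{A'}| . \tag{60}$$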
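
The step from (59) to (60) only uses an elementary bound on the ceiling term. A short worked version, under the assumption that logarithms are to base 2 and $|X_{A'}| \ge 2$ (so that $\log|X_{A'}| \ge 1$), reads:

```latex
\begin{align*}
\log\lceil 2 \log|X_{A'}| \rceil
  &\le \log\bigl( 2 \log|X_{A'}| + 1 \bigr) \\
  &\le \log\bigl( 4 \log|X_{A'}| \bigr)
   = 2 + \log\log|X_{A'}| ,
\end{align*}
% hence the constant in (59) is absorbed as
% 2 + \log\lceil 2\log|X_{A'}| \rceil \le 4 + \log\log|X_{A'}| .
```

The second inequality holds because $2t + 1 \le 4t$ whenever $t \ge 1/2$.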
Using a lower bound for the max-information in terms of min-entropies (Lemma 13), and the behaviour of the max-information under projective measurements (Lemma 14), this simplifies to

$$c \le \max_y I_{\max}(X_{A'} : R)_{\rho^y} + 4 \cdot \log\frac{1}{\varepsilon} + 4 + \log\log|X_{A'}| \tag{61}$$
$$\le I_{\max}(X_{A'} : R)_{\rho} + 4 \cdot \log\frac{1}{\varepsilon} + 4 + \log\log|X_{A'}| . \tag{62}$$

Furthermore, it is easily seen from (49) that

$$c + s \le H_0(X_{A'})_{\rho} + 2 + \log\log|X_{A'}| . \tag{63}$$



As the last step, we reduce the classical communication and shared randomness cost by smoothing the max-information and the zero-Rényi entropy in (62) and (63), respectively. For that, we do not apply the protocol as described above to the state $\rho_{A X_A X_{A'} R}$, but pretend that we have another classically coherent (sub-normalised) state $\tilde\rho_{A X_A X_{A'} R}$ that is $(\sqrt{8\varepsilon'} + \varepsilon')$-close to $\rho_{A X_A X_{A'} R}$, and then apply the protocol for $\tilde\rho_{A X_A X_{A'} R}$. By the monotonicity of the purified distance, the additional error term from this is upper bounded by $\sqrt{8\varepsilon'} + \varepsilon'$, and by the triangle inequality for the purified distance this results in a total accuracy of $\varepsilon + \sqrt{8\varepsilon'} + \varepsilon' + |X_{A'}|^{-1/2}$. We now proceed by defining $\tilde\rho_{A X_A X_{A'} R}$. Let $\hat\rho_{X_{A'} R} \in \mathcal{B}^{\varepsilon'}(\rho_{X_{A'} R})$ such that

$$I_{\max}(X_{A'} : R)_{\hat\rho} = I_{\max}^{\varepsilon'}(X_{A'} : R)_{\rho} . \tag{64}$$

Furthermore, since the zero-Rényi entropy can be smoothed by applying a projection (Lemma 22), there exists $\Pi_{X_{A'}} \in \mathcal{P}(X_{A'})$ with $\Pi_{X_{A'}} \le \mathbb{1}_{X_{A'}}$ such that

$$H_0^{\varepsilon'}(X_{A'})_{\rho} \ge H_0(X_{A'})_{\tilde\rho} , \tag{65}$$

with $\tilde\rho_{X_{A'}} = \Pi_{X_{A'}} \hat\rho_{X_{A'}} \Pi_{X_{A'}}$ close to $\rho_{X_{A'}}$ and classical with respect to the basis $\{|x\rangle\}$. By the properties of the purified distance [52, Chapter 3], there exists a purification $\tilde\rho_{A X_A X_{A'} R} \in \mathcal{B}^{\sqrt{8\varepsilon'} + \varepsilon'}(\rho_{A X_A X_{A'} R})$ that is classically coherent on $X_A X_{A'}$ with respect to the basis $\{|x\rangle\}$. Applying the protocol for this state $\tilde\rho_{A X_A X_{A'} R}$, the classical communication cost (62) becomes, by the monotonicity of the max-information (Lemma 16) and (64),

$$c \le I_{\max}^{\varepsilon'}(X_{A'} : R)_{\rho} + 4 \cdot \log\frac{1}{\varepsilon} + 4 + \log\log|X_{A'}| , \tag{66}$$

and by (65) the sum cost (63) becomes

$$c + s \le H_0^{\varepsilon'}(X_{A'})_{\rho} + 2 + \log\log|X_{A'}| . \tag{67}$$

□
For completeness we also state a converse for the classical communication cost of classical state splitting of classically coherent states.

Theorem 6. Let $\varepsilon > 0$, $\varepsilon' > 0$, and $\rho_{A X_A X_{A'} R} \in \mathcal{V}_{\le}(\mathcal{H}_{A X_A X_{A'} R})$ be classically coherent on $X_A X_{A'}$ with respect to the basis $\{|x\rangle\}$. Then the classical communication cost for any $\varepsilon$-error classical state splitting protocol for $\rho_{A X_A X_{A'} R}$ is lower bounded by

c > I^+i {Xa' : R)p - log + 2) . (68) 

Proof. We have a look at the correlations between Bob and the reference by analyzing the max-information that Bob has about the reference (recall that this will be a max-information of the form $I_{\max}(R : B)$, where $R$ is the reference system and $B$ here is a general label for whatever Bob's system is). At the beginning of any protocol, there is no register at Bob's side correlated with the reference and therefore the max-information that Bob has about the reference is zero. Since back communication is not allowed, we can assume that the protocol for state splitting has the following form: applying local operations at Alice's side, sending bits from Alice to Bob, and then applying local operations at Bob's side. Local operations at Alice's side have no influence on the max-information that Bob has about the reference. By sending $c$ bits from Alice to Bob, the max-information that Bob has about the reference can increase, but at most by $c$ (Corollary 18). By applying local operations at Bob's side, the max-information that Bob has about the reference can only decrease (Lemma 12). So the max-information that Bob has about the reference is upper bounded by $c$. Therefore, any state $\omega_{X_B R}$ at the end of a state splitting protocol must satisfy $I_{\max}(R : X_B)_\omega \le c$. But we also need $\omega_{X_B R} \approx_\varepsilon \rho_{X_B R} = \mathcal{I}_{X_{A'} \to X_B}(\rho_{X_{A'} R})$ by the definition of $\varepsilon$-error state splitting (Definition 3). Using the definition of the smooth max-information, and that the smooth max-information is approximately symmetric in its arguments (Lemma 1), we obtain the bound in the statement of the theorem. □
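
The bookkeeping in this converse argument can be summarised as a chain of (in)equalities. The following is only a schematic restatement of the proof steps above; the subscripts naming the protocol stages are informal labels introduced here, not notation from the paper.

```latex
\begin{align*}
I_{\max}(R : B)_{\text{start}} &= 0
  && \text{(Bob holds nothing correlated with } R\text{)} \\
I_{\max}(R : B)_{\text{after Alice's local operations}} &= 0
  && \text{(Alice's operations do not touch } B\text{)} \\
I_{\max}(R : B)_{\text{after sending } c \text{ bits}} &\le c
  && \text{(Corollary 18)} \\
I_{\max}(R : X_B)_{\omega} &\le c
  && \text{(Bob's local operations, Lemma 12)}
\end{align*}
```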



V. UNIVERSAL MEASUREMENT COMPRESSION 



In this section, we establish our main result: feedback and non-feedback universal measurement compression. Theorem 7 characterizes the trade-off between shared randomness and classical communication required to simulate many instances of a measurement on an arbitrary input state in such a way that both the sender and receiver obtain the outcomes of the measurement (feedback simulation), and Theorem 10 characterizes the trade-off for the non-feedback case when only the receiver is required to get the outcomes of the measurement.

Footnote to Theorem 6: We do not mention the cost of the shared randomness resource, since the statement holds independently of it.

Definition 5 (One-shot Measurement Compression). Consider a bipartite system with parties Alice and Bob. Let $\delta > 0$, and $\mathcal{M} : \mathcal{L}(\mathcal{H}_A) \to \mathcal{L}(\mathcal{H}_X)$ be a quantum-classical channel, with quantum input $A$ and classical output $X$. A quantum protocol $\mathcal{P}$ is a one-shot feedback measurement compression for $\mathcal{M}$ with error $\delta$ if it consists of using $s$ bits of shared randomness, applying local operations at Alice's side, sending $c$ classical bits from Alice to Bob, applying local operations at Bob's side, and

$$\| \mathcal{P} - \Delta \circ \mathcal{M} \|_\diamond \le \delta , \tag{69}$$

where $\Delta : \mathcal{L}(\mathcal{H}_X) \to \mathcal{L}(\mathcal{H}_{X_A}) \otimes \mathcal{L}(\mathcal{H}_{X_B})$ is a classical copying map,

$$\Delta(\sigma) = \sum_x \langle x|\sigma|x\rangle \cdot |x\rangle\langle x|_{X_A} \otimes |x\rangle\langle x|_{X_B} , \tag{70}$$

ensuring that both Alice and Bob obtain the measurement outcome. The quantity $c$ is called the classical communication cost, and $s$ is the shared randomness cost. For the case of a non-feedback measurement compression, we only require the following condition to hold

$$\| \mathcal{P} - \mathcal{M} \|_\diamond \le \delta , \tag{71}$$

because Alice does not need to recover the output of the simulation in this case.

Definition 6 (Universal Measurement Compression). Let $\mathcal{M} : \mathcal{L}(\mathcal{H}_A) \to \mathcal{L}(\mathcal{H}_X)$ be a quantum-classical channel. An asymptotic measurement compression for $\mathcal{M}$ is a sequence of one-shot measurement compressions $\mathcal{P}^n$ for $\mathcal{M}^{\otimes n}$ with error $\delta_n$, such that $\lim_{n \to \infty} \delta_n = 0$. The classical communication rate is $\limsup_{n \to \infty} c_n / n$ and the shared randomness rate is $\limsup_{n \to \infty} s_n / n$ (where $c_n$ and $s_n$ denote the corresponding costs for the one-shot measurement compressions).

A. Feedback Simulation 

Theorem 7. Let $\mathcal{M} : \mathcal{L}(A) \to \mathcal{L}(X)$ be a quantum-to-classical channel. Then there exist asymptotic feedback measurement compressions for $\mathcal{M}$ if and only if the classical communication rate $C$ and shared randomness rate $S$ lie in the following rate region:

$$C \ge \max_\rho I(X : R)_{(\mathcal{M} \otimes \mathcal{I})(\rho)} \tag{72}$$
$$C + S \ge \max_\rho H(X)_{\mathcal{M}(\rho)} , \tag{73}$$

where $\rho_{AR} \in \mathcal{V}(AR)$ is a purification of the input state $\rho_A \in \mathcal{S}(A)$. Or equivalently, for a given shared randomness rate $S$, the optimal rate of classical communication is equal to

$$C(S) = \max\left\{ \max_\rho I(X : R)_{(\mathcal{M} \otimes \mathcal{I})(\rho)} ,\; \max_\rho H(X)_{\mathcal{M}(\rho)} - S \right\} . \tag{74}$$

In particular, when sufficient shared randomness is available, the rate of classical communication is given by

$$C(\infty) = \max_\rho I(X : R)_{(\mathcal{M} \otimes \mathcal{I})(\rho)} . \tag{75}$$

Footnote to Definition 5: If we state the task of measurement compression as being that a verifier who is given the reference system and classical output should not be able to distinguish the true channel from the simulation, then we should also demand that the common randomness and classical communication be private from the verifier.

Footnote to Theorem 7: Note that the two maxima in (72) and (73) can be achieved for different states.
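
To make the trade-off (74) concrete, the following sketch evaluates the two entropic quantities for a single input state in a deliberately restricted toy setting: the input state and all POVM elements are assumed to be diagonal in the same basis, so that every quantity reduces to a classical entropy. The maximisation over $\rho$ in (72) and (73) is not carried out, and the names `feedback_quantities` and `optimal_communication_rate` are hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a list of non-negative weights summing to 1."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def feedback_quantities(r, povm):
    """Return (I(X:R), H(X)) for one input state in the commuting toy case.

    r[i]       : eigenvalues of the input state rho_A (assumed diagonal)
    povm[x][i] : i-th diagonal entry of POVM element M_x (assumed diagonal),
                 with sum_x povm[x][i] == 1 for every i
    """
    p_x = [sum(ri * mx[i] for i, ri in enumerate(r)) for mx in povm]
    h_x = entropy(p_x)
    # H(R|X): average entropy of the reference state conditioned on outcome x
    h_r_given_x = 0.0
    for mx, px in zip(povm, p_x):
        if px > 0:
            h_r_given_x += px * entropy([ri * mx[i] / px for i, ri in enumerate(r)])
    return entropy(r) - h_r_given_x, h_x


def optimal_communication_rate(max_info, max_entropy, shared_randomness_rate):
    """Right-hand side of (74), given the two maximised quantities."""
    return max(max_info, max_entropy - shared_randomness_rate)
```

For a qubit with uniform input `r = [0.5, 0.5]`, the ideal measurement `[[1, 0], [0, 1]]` gives $I(X:R) = H(X) = 1$, so shared randomness does not help. The trivial measurement `[[0.5, 0.5], [0.5, 0.5]]` gives $I(X:R) = 0$ and $H(X) = 1$, so one bit of shared randomness per copy removes the need for communication entirely.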

Proof. We first show that the right-hand side of (72) is a lower bound on the classical communication rate, and that (73) is a lower bound on the sum rate (Proposition 8). Then we show that these lower bounds can be achieved (Proposition 9). The general rate trade-off in (72)-(73) and (74) immediately follows, since the shared randomness can always be created by classical communication. □

Proposition 8 (Converse). Let $\mathcal{M} : \mathcal{L}(A) \to \mathcal{L}(X)$ be a quantum-to-classical channel. Then we have for any asymptotic measurement compression for $\mathcal{M}$ that

$$C \ge \max_\rho I(X : R)_{(\mathcal{M} \otimes \mathcal{I})(\rho)} \tag{76}$$
$$C + S \ge \max_\rho H(X)_{\mathcal{M}(\rho)} , \tag{77}$$

where $\rho_{AR} \in \mathcal{V}(AR)$ is a purification of the input state $\rho_A \in \mathcal{S}(A)$.

Proof. This proposition follows from the converse for the case of a fixed IID source [60, Theorem 8], since the asymptotic measurement compressions must in particular work for any fixed IID input state $\rho^{\otimes n}$ (for $n \to \infty$). To see this explicitly worked out with the feedback assumption, see Section 2.4 of [57]. □

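Proposition 9 (Achievability). Let $\mathcal{M} : \mathcal{L}(A) \to \mathcal{L}(X)$ be a quantum-to-classical channel. Then there exist asymptotic feedback measurement compressions for $\mathcal{M}$ with

$$C \le \max_\rho I(X : R)_{(\mathcal{M} \otimes \mathcal{I})(\rho)} \tag{78}$$
$$C + S \le \max_\rho H(X)_{\mathcal{M}(\rho)} , \tag{79}$$

where $\rho_{AR} \in \mathcal{V}(AR)$ is a purification of the input state $\rho_A \in \mathcal{S}(A)$.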
Proof. We show the existence of a sequence of one-shot feedback measurement compressions $\mathcal{P}^n$ for $\mathcal{M}^{\otimes n}$ with asymptotically vanishing error $\varepsilon_n$, a classical communication rate $c_n / n$ as in (78), and a shared randomness rate $s_n / n$ such that the sum rate becomes as in (79). Without loss of generality, we choose $\mathcal{P}^n$ to be permutation covariant (see the footnote on permutation covariance). The post-selection technique for quantum channels (Proposition 28) then applies and upper bounds the error by

$$\delta_n = \left\| \mathcal{M}^{\otimes n}_{A \to X_A X_B} - \mathcal{P}^n_{A \to X_A X_B} \right\|_\diamond \le (n+1)^{|A|^2 - 1} \cdot \left\| \left( (\mathcal{M}^{\otimes n}_{A \to X_A X_B} - \mathcal{P}^n_{A \to X_A X_B}) \otimes \mathcal{I}_{RR'} \right)(\zeta^n_{ARR'}) \right\|_1 , \tag{80}$$

where $\zeta^n_{ARR'}$ is a purification of the de Finetti state $\zeta^n_{AR} = \int \psi_{AR}^{\otimes n} \, d(\psi_{AR})$ with $\psi_{AR} \in \mathcal{V}(AR)$, $A \cong R$, and $d(\cdot)$ the measure on the normalized pure states on $AR$ induced by the Haar measure on the unitary group acting on $AR$, normalized to $\int d(\cdot) = 1$. Hence, it is sufficient to consider simulating the measurement on a purification of the de Finetti state:

$$\omega^n_{X_A X_B R R'} = \left( \mathcal{M}^{\otimes n}_{A \to X_A X_B} \otimes \mathcal{I}_{RR'} \right)(\zeta^n_{ARR'}) , \tag{81}$$

up to an error $o\left( (n+1)^{1 - |A|^2} \right)$ in trace distance, for an asymptotic classical communication cost smaller than (78). For this, we consider a local Stinespring dilation $U_{A \to E X_A X_{A'}}$ of the measurement $\mathcal{M}$ at Alice's side, followed by classical state splitting of the resulting classically coherent state (Theorem 5). Let $U^n_{A \to E X_A X_{A'}} = U^{\otimes n}_{A \to E X_A X_{A'}}$ and

$$\omega^n_{E X_A X_{A'} R R'} = U^n_{A \to E X_A X_{A'}}(\zeta^n_{ARR'}) . \tag{82}$$

As mentioned above, this map can be made permutation invariant. For fixed $\varepsilon_n > 0$, Theorem 5 then assures that the map outputs a state which is

$$4 \cdot \varepsilon_n + 4\sqrt{2\varepsilon_n} + 2 \cdot |X_{A'}|^{-n/2} \tag{83}$$

close to (81) in trace distance (the trace distance is upper bounded by two times the purified distance, Lemma 24), for a classical communication cost

$$c_n \le I_{\max}^{\varepsilon_n}(X_{A'} : RR')_{\omega^n} + 4 \cdot \log\frac{1}{\varepsilon_n} + 4 + \log\log|X_{A'}| + \log n , \tag{84}$$

and a sum cost

$$c_n + s_n \le H_0^{\varepsilon_n}(X_{A'})_{\omega^n} + 2 + \log\log|X_{A'}| + \log n , \tag{85}$$

where the last two terms on the right in each of the above expressions come from the fact that $\log\log|X_{A'}|^n = \log\log|X_{A'}| + \log n$. We now analyse the asymptotic behaviour of (84) and (85).

Footnote on permutation covariance: By the following argument, every protocol can be made permutation covariant. To start with, Alice applies a random permutation $\pi$ on the input system chosen according to some shared randomness. This is then followed by the original protocol (which might not yet be permutation covariant), and Bob who undoes the permutation by applying $\pi^{-1}$ on the output system. The shared randomness cost of this procedure can be kept sub-linear in $n$ by using randomness recycling as discussed in [2, Section IV.D].
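
The symmetrisation argument from the footnote on permutation covariance (random permutation $\pi$ before the protocol, $\pi^{-1}$ after it) can be mimicked classically. In the sketch below, lists stand in for the $n$ input and output systems and a seeded generator stands in for the shared randomness; this is only an illustration of the wrapping pattern, not of the quantum protocol or of randomness recycling, and the name `symmetrize` is hypothetical.

```python
import random


def symmetrize(protocol, seed):
    """Wrap `protocol` (a function from a list to a list of equal length)
    with a shared random permutation, as in the symmetrisation argument.

    `seed` models the shared randomness known to both Alice and Bob.
    """
    def wrapped(inputs):
        n = len(inputs)
        rng = random.Random(seed)          # both parties derive the same pi
        perm = list(range(n))
        rng.shuffle(perm)
        # Alice: apply pi to the input positions
        permuted = [inputs[perm[k]] for k in range(n)]
        # run the original (possibly non-covariant) protocol
        outputs = protocol(permuted)
        # Bob: apply pi^{-1} to the output positions
        restored = [None] * n
        for k in range(n):
            restored[perm[k]] = outputs[k]
        return restored
    return wrapped
```

If the original protocol already acts identically on every position, wrapping it changes nothing; a position-dependent protocol is averaged over positions once the seed is chosen at random.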

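By a dimension upper bound for the smooth max-information (Lemma 17), and the fact that we can assume $|R'| \le (n+1)^{|A|^2 - 1}$ (Proposition 28), we get

$$c_n \le I_{\max}^{\varepsilon_n}(X_{A'} : R)_{\omega^n} + 2 \cdot \log\left[ (n+1)^{|A|^2 - 1} \right] + 4 \cdot \log\frac{1}{\varepsilon_n} + 4 + \log\log|X_{A'}| + \log n . \tag{86}$$

By a corollary of Carathéodory's theorem (Lemma 29), we write

$$\zeta^n_{AR} = \sum_{i \in I} p_i \, (\sigma^i_{AR})^{\otimes n} , \tag{87}$$

where $\sigma^i_{AR} \in \mathcal{V}(AR)$, $I = \{1, 2, \ldots, (n+1)^{2|A||R| - 2}\}$, and $\{p_i\}_{i \in I}$ a probability distribution. Using a quasi-convexity property of the smooth max-information (Lemma 19), and for

$$\chi = 2 \cdot \log\left[ (n+1)^{|A|^2 - 1} \right] + 4 \cdot \log\frac{1}{\varepsilon_n} + 4 + \log\log|X_{A'}| + \log n ,$$

we obtain

$$c_n \le I_{\max}^{\varepsilon_n}(X_{A'} : R)_{\omega^n} + \chi \tag{88}$$
$$= I_{\max}^{\varepsilon_n}(X_{A'} : R)_{\sum_i p_i [(\mathcal{M} \otimes \mathcal{I})(\sigma^i)]^{\otimes n}} + \chi \tag{89}$$
$$\le \max_i I_{\max}^{\varepsilon_n}(X_{A'} : R)_{[(\mathcal{M} \otimes \mathcal{I})(\sigma^i)]^{\otimes n}} + \log\left[ (n+1)^{2|A||R| - 2} \right] + \chi \tag{90}$$
$$\le \max_\rho I_{\max}^{\varepsilon_n}(X_{A'} : R)_{[(\mathcal{M} \otimes \mathcal{I})(\rho)]^{\otimes n}} + \log\left[ (n+1)^{2|A||R| - 2} \right] + \chi , \tag{91}$$

where the last maximum ranges over all $\rho_{AR} \in \mathcal{V}(AR)$. From the asymptotic equipartition property for the smooth max-information (Lemma 23) we obtain

$$c_n \le n \cdot \max_\rho I(X_{A'} : R)_{(\mathcal{M} \otimes \mathcal{I})(\rho)} + \sqrt{n} \cdot \xi(\varepsilon_n) - 2 \cdot \log\frac{\varepsilon_n^2}{24} + \log\left[ (n+1)^{2|A||R| - 2} \right] + \chi , \tag{92}$$

where $\xi(\varepsilon_n) = 8\sqrt{13 - 4 \cdot \log \varepsilon_n} \cdot (2 + \tfrac{1}{2} \cdot \log|A|)$. By choosing

$$\varepsilon_n = (n+1)^{4(1 - |A|^2)} , \tag{93}$$

we get an asymptotic classical communication cost of

$$c = \limsup_{n \to \infty} \frac{c_n}{n} \le \max_\rho I(X_{A'} : R)_{(\mathcal{M} \otimes \mathcal{I})(\rho)} \tag{94}$$

for a vanishing asymptotic error (80), (83), (93):

$$\limsup_{n \to \infty} \delta_n \le \limsup_{n \to \infty} \left[ \left( 4 \cdot (n+1)^{4(1 - |A|^2)} + 4\sqrt{2} \cdot (n+1)^{2(1 - |A|^2)} + 2 \cdot |X_{A'}|^{-n/2} \right) \cdot (n+1)^{|A|^2 - 1} \right] = 0 . \tag{95}$$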
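
That the choice (93) makes the error in (95) vanish while the overhead terms stay sub-linear can be checked numerically. The sketch below plugs (93) into the bound (83) times the post-selection factor from (80), and separately evaluates the terms collected in $\chi$ divided by $n$. The default dimensions $|A| = |X_{A'}| = 2$ are an assumption made for illustration, logarithms are to base 2, and the function names are hypothetical.

```python
import math


def error_bound(n, dim_a=2, dim_x=2):
    """Right-hand side of (95) before taking the limit."""
    eps_n = (n + 1.0) ** (4 * (1 - dim_a ** 2))                  # choice (93)
    splitting_error = (4 * eps_n
                       + 4 * math.sqrt(2 * eps_n)
                       + 2 * float(dim_x) ** (-n / 2))            # bound (83)
    return (n + 1.0) ** (dim_a ** 2 - 1) * splitting_error        # factor from (80)


def overhead_per_copy(n, dim_a=2, dim_x=2):
    """The terms collected in chi, divided by n (assumes dim_x >= 2)."""
    log_inv_eps = 4 * (dim_a ** 2 - 1) * math.log2(n + 1)         # log(1/eps_n)
    chi = (2 * (dim_a ** 2 - 1) * math.log2(n + 1)
           + 4 * log_inv_eps
           + 4
           + math.log2(math.log2(dim_x))
           + math.log2(n))
    return chi / n
```

With these defaults the polynomial prefactor initially dominates (the bound is far above 1 at $n = 10$), but the exponentially small term $|X_{A'}|^{-n/2}$ and the choice of $\varepsilon_n$ take over for larger $n$.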
Furthermore, we estimate the asymptotic behaviour of the sum cost (85) by using (87) and a quasi-convexity property of the smooth zero-Rényi entropy (Lemma 20). For $\chi' = 2 + \log\log|X_{A'}| + \log n$, we get

$$c_n + s_n \le \max_i H_0^{\varepsilon_n}(X_{A'})_{[\mathcal{M}(\sigma^i)]^{\otimes n}} + \log\left[ (n+1)^{2|A||R| - 2} \right] + \chi' \tag{96}$$
$$\le \max_\rho H_0^{\varepsilon_n}(X_{A'})_{[\mathcal{M}(\rho)]^{\otimes n}} + \log\left[ (n+1)^{2|A||R| - 2} \right] + \chi' , \tag{97}$$

where $\rho_A \in \mathcal{S}(A)$. By the equivalence of the smooth zero-Rényi entropy and the smooth max-entropy (Lemma 21), and the asymptotic equipartition property for the smooth max-entropy (Lemma 23), we arrive at

cn + sn< mc,xH'^J^{XA')Mipr- + 2 • log ^ + log \{n + l)2|^ll«l-2l + y (98) 



< n • vc^-kH{Xa')m{p) + ^ ■ - 2 • log ^ • (2 + J • log \Xa'\) 



+ 2 • log ^ + log 



where $\rho_A \in \mathcal{S}(A)$. By employing (93), we get for the asymptotic limit

$$c + s = \limsup_{n \to \infty} \frac{1}{n}(c_n + s_n) \le \max_\rho H(X_{A'})_{\mathcal{M}(\rho)} , \tag{100}$$

where $\rho_A \in \mathcal{S}(A)$. □



B. Non-Feedback Simulation 

Theorem 10. Let $\mathcal{M} : \mathcal{L}(\mathcal{H}_A) \to \mathcal{L}(\mathcal{H}_X)$ be a quantum-to-classical channel. Then there exist asymptotic non-feedback measurement compressions for $\mathcal{M}$ if and only if the classical communication rate $C$ and shared randomness rate $S$ lie in the rate region given by the union of the following regions,

$$C \ge \max_\rho I(W : R)_\beta \tag{101}$$
$$C + S \ge \max_\rho I(W : XR)_\beta , \tag{102}$$

where the state $\beta_{WXR}$ has the form

$$\beta_{WXR} = \sum_{x,w} q_{x|w} \cdot |w\rangle\langle w|_W \otimes |x\rangle\langle x|_X \otimes \mathrm{tr}_A\left[ (\mathcal{N}^w_A \otimes \mathcal{I}_R)(\rho_{AR}) \right] , \tag{103}$$

$\rho_{AR} \in \mathcal{V}(\mathcal{H}_{AR})$ is a purification of the input state $\rho_A \in \mathcal{S}(\mathcal{H}_A)$, and the union is with respect to all decompositions of the measurement $\mathcal{M}$ in terms of internal measurements $\mathcal{N} = \{N_w\}$ and conditional post-processing distributions $q_{x|w}$. That is, for all states $\sigma$, it should hold that

$$\mathcal{M}(\sigma) = \sum_x \mathrm{tr}[M_x \sigma] \, |x\rangle\langle x| = \sum_{x,w} q_{x|w} \, \mathrm{tr}[N_w \sigma] \, |x\rangle\langle x| . \tag{104}$$

Or equivalently, for a given shared randomness rate $S$, the optimal rate of classical communication is equal to

$$C(S) = \min \max\left\{ \max_\rho I(W : R)_\beta ,\; \max_\rho I(W : XR)_\beta - S \right\} , \tag{105}$$

where the minimum ranges over all decompositions of $\mathcal{M}$ as in (104).

By the data processing inequality for the mutual information, it holds that $I(W : R)_\beta \ge I(X : R)_{\mathcal{M}(\rho)}$, and hence, the classical communication cost can only increase compared to a feedback simulation (Theorem 7). However, if the savings in common randomness consumption are larger than the increase in classical communication cost, then there is an advantage to performing a non-feedback simulation. It follows from the considerations in [40, 57] that the rate trade-offs (74) and (105) become identical if and only if the elements of the measurement to simulate are all rank-one operators.

Proof. We see from the converse for the case of a fixed IID source [57, Theorem 9] that the right-hand side of (101) is a lower bound on the classical communication rate, and that (102) is a lower bound on the sum rate. This is because the asymptotic non-feedback measurement compression must work in particular for any fixed IID input state $\rho^{\otimes n}$ (as $n \to \infty$).

As the next step, we show that these lower bounds can be achieved. The general rate trade-off in (101)-(102) and (105) then immediately follows, since shared randomness can always be created by classical communication.

The idea for the achievability part is as follows. Given a particular decomposition of the measurement $\mathcal{M} = \{M_x\}$ as $\{\sum_w q_{x|w} \cdot N_w\}$ as stated above, Alice and Bob just use a feedback measurement compression protocol (as in the proof of Theorem 7) to simulate the measurement $\mathcal{N} = \{N_w\}$. This is followed by a local simulation of the classical map $q_{x|w}$ at no cost at Bob's side. Finally, Alice and Bob can use randomness recycling to extract $H_{\min}(W|RX)_\beta$ bits of shared randomness back [2]. In the one-shot case, this leads to a classical communication cost of $I_{\max}(W : R)_\beta$, and a sum cost $I_{\max}(W : RX)_\beta$. For technical reasons, we smooth the states using typical projectors (see Appendix E for background on typical projectors) and arrive at the rates given in the statement of the theorem.
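
The dependence of (101) and (102) on the chosen decomposition can be seen in a toy computation. As before, the sketch assumes that the input state and the internal measurement elements $N_w$ are diagonal in a common basis, so that $\beta_{WXR}$ is effectively a classical joint distribution over $(w, x, i)$ and both mutual informations are classical. Only a single input state is evaluated, and the name `nonfeedback_quantities` is hypothetical.

```python
import math
from itertools import product


def entropy(dist):
    """Shannon entropy in bits of an iterable of non-negative weights."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def nonfeedback_quantities(r, internal, post):
    """Return (I(W:R), I(W:XR)) for one input state in the commuting toy case.

    r[i]           : eigenvalues of the input state (assumed diagonal)
    internal[w][i] : i-th diagonal entry of N_w, with sum_w internal[w][i] == 1
    post[w][x]     : post-processing probability q_{x|w}
    """
    joint = {}
    for w, i in product(range(len(internal)), range(len(r))):
        for x, q in enumerate(post[w]):
            joint[(w, x, i)] = r[i] * internal[w][i] * q

    def h(keep):
        # entropy of the marginal on the coordinates listed in `keep`
        marg = {}
        for key, pr in joint.items():
            sub = tuple(key[j] for j in keep)
            marg[sub] = marg.get(sub, 0.0) + pr
        return entropy(marg.values())

    i_w_r = h((0,)) + h((2,)) - h((0, 2))
    i_w_xr = h((0,)) + h((1, 2)) - h((0, 1, 2))
    return i_w_r, i_w_xr
```

For the trivial qubit measurement $M_0 = M_1 = \mathbb{1}/2$ with uniform input, taking $W = X$ (`internal = [[0.5, 0.5], [0.5, 0.5]]`, `post = [[1, 0], [0, 1]]`) gives $I(W:XR) = 1$, whereas a single internal outcome followed by a local coin flip (`internal = [[1, 1]]`, `post = [[0.5, 0.5]]`) gives $I(W:R) = I(W:XR) = 0$. This illustrates how a good decomposition can save shared randomness in the non-feedback setting.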
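Let $\{q_{x|w}, N_w\}$ be a fixed decomposition of $\mathcal{M}$. As in the feedback case (Theorem 7) we employ the post-selection technique (Proposition 28) to upper bound the error for one-shot non-feedback compressions for $\mathcal{M}^{\otimes n}$ by

$$\delta_n = \left\| \mathcal{M}^{\otimes n}_{A \to X_B} - \mathcal{P}^n_{A \to X_B} \right\|_\diamond \tag{106}$$
$$\le (n+1)^{|A|^2 - 1} \cdot \left\| \left( (\mathcal{M}^{\otimes n}_{A \to X_B} - \mathcal{P}^n_{A \to X_B}) \otimes \mathcal{I}_{RR'} \right)(\zeta^n_{ARR'}) \right\|_1 , \tag{107}$$

where $\zeta^n_{ARR'}$ is a purification of the de Finetti state $\zeta^n_{AR} = \int \psi_{AR}^{\otimes n} \, d(\psi_{AR})$ with $\psi_{AR} \in \mathcal{V}(\mathcal{H}_{AR})$, $A \cong R$, and $d(\cdot)$ the measure on the normalized pure states on $\mathcal{H}_{AR}$ induced by the Haar measure on the unitary group acting on $\mathcal{H}_{AR}$, normalized to $\int d(\cdot) = 1$. Hence, it is sufficient to consider simulating the measurement $\mathcal{M}^{\otimes n}$ on a purification of the de Finetti state

$$\omega^n_{X_B R R'} = \left( \mathcal{M}^{\otimes n}_{A \to X_B} \otimes \mathcal{I}_{RR'} \right)(\zeta^n_{ARR'}) , \tag{108}$$

up to an error $o\left( (n+1)^{1 - |A|^2} \right)$ in trace distance, for an asymptotic simulation cost smaller than in (101) and (102). For this, the idea is to consider a local Stinespring dilation $V_{A \to E W_A W_{A'}}$ of the measurement $\mathcal{N}_{A \to W_{A'}}$ at Alice's side, followed by classical state splitting of the resulting classically coherent state (along the lines of Theorem 5). Let $V^n_{A \to E W_A W_{A'}} = V^{\otimes n}_{A \to E W_A W_{A'}}$ and

$$\omega^n_{E W_A W_{A'} R R'} = V^n_{A \to E W_A W_{A'}}(\zeta^n_{ARR'}) . \tag{109}$$

However, Alice and Bob will not execute the protocol with respect to the state $\omega^n_{E W_A W_{A'} R R'}$ directly, but they will do so with respect to another pure, sub-normalized state $\gamma^n_{E W_A W_{A'} R R'}$ that is classically coherent on $W_A W_{A'}$ with respect to the basis $\{|w\rangle\}_{w \in W_{A'}}$, and such that

$$\left\| \gamma^n_{E W_A W_{A'} R R'} - \omega^n_{E W_A W_{A'} R R'} \right\|_1 \le \varepsilon_n \tag{110}$$

for some $\varepsilon_n > 0$. By a corollary of Carathéodory's theorem (Lemma 29), we write

$$\zeta^n_{AR} = \sum_{i \in I} p_i \, (\sigma^i_{AR})^{\otimes n} , \tag{111}$$

where $\sigma^i_{AR} \in \mathcal{V}(\mathcal{H}_{AR})$, $I = \{1, 2, \ldots, (n+1)^{2|A||R| - 2}\}$, and $\{p_i\}$ a probability distribution. From this, we define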

^TwaWa,R = m^EWAWA> 0lR){cT\n)f'', (112) 
as well as its reduction as a classical-quantum state on the systems W^z-R": 

'y'wAR = ^Pw-liiw^'li) \w''){w''\w2, ® T^n", (113) 
for some distribution $p_{W^n|I}(w^n|i)$. On this state, we act with typical projectors to flatten its
spectrum as we need, defining the projected state 7^4^^ as follows: 

where is a typical projector corresponding to the distribution py^n\i{w'^\i) , 5 a 

conditionally typical projector corresponding to the conditional state on the system i?", and 
n"i ^ is a typical projector corresponding to the state 7)^" (see Appendix E for details of typical 
projectors). It follows from the properties of typical projectors that the projected state 7^^ 
becomes arbitrarily close in trace distance to the original state 7^^: 

for some > and sufficiently large n. The equivalence of the trace distance and the purified 
distance (Lemma 24) together with Uhlmann's theorem then imply the existence of some subnor- 
malized pure state I^eWaWj^iR such that 

(116) 

Hence, we get by (111) and (109) that 

IeWaWaiR ^ Z^Pi' ^EWaWa'R 

(117) 

iei 

is $\varepsilon_n$-close to $\omega^n_{E W_A W_{A'} R}$ in purified distance. By features of the purified distance [52, Chapter 3], and the equivalence of the trace distance and the purified distance (Lemma 24), we then get that there exists an extension $\gamma^n_{E W_A W_{A'} R R'}$ of $\gamma^n_{E W_A W_{A'} R}$ with the desired properties such that (110) holds.

Alice and Bob will now act with a classical state splitting protocol for $W_A W_{A'}$ with respect to the classically coherent state $\gamma^n_{E W_A W_{A'} R R'}$. However, we do not directly use our result about classical state splitting (Theorem 5), but instead employ a non-smooth version that is implicit in the proof of Theorem 5. It follows from (61) and (62) that for an $(\varepsilon_n + |W_{A'}|^{-n/2})$-error (in purified distance) classical state splitting protocol for $W_A W_{A'}$, a classical communication cost

$$c_n \le \max_y I_{\max}(W_{A'} : RR')_{\gamma^{n,y}} + 4 \cdot \log\frac{1}{\varepsilon_n} + 4 + \log\log|W_{A'}| + \log n \tag{118}$$

is achievable, and it follows from (49) and (63) that the sum cost becomes

$$c_n + s_n \le \max_y H_0(W_{A'})_{\gamma^{n,y}} + 2 + \log\log|W_{A'}| + \log n , \tag{119}$$
where the measurement outcomes $y$ are with respect to the pre-processing measurement defined in (42). This provides Bob with the measurement outcomes of $\mathcal{N}$ for the fixed de Finetti type input state $\zeta^n_{ARR'}$, and a total error of $(3 \cdot \varepsilon_n + 2 \cdot |W_{A'}|^{-n/2})$ in trace distance. A local simulation of the classical map $q_{x^n|w^n}$ at no cost at Bob's side then provides Bob with the measurement outcomes of $\mathcal{M}$ as desired (again for the fixed de Finetti type input state $\zeta^n_{ARR'}$ and the same error). However, the sum cost of this non-feedback measurement simulation can be reduced by invoking an additional randomness recycling step as in Ref. [2]. We do this by having Alice and Bob apply, conditioned on $y$, a strong classical min-entropy extractor on $W$ against the (quantum) side information $XRR'$ (Proposition 27), and this lowers the sum cost to

$$c_n + s_n \le \max_y \left( H_0(W_{A'})_{\gamma^{n,y}} - H_{\min}(W_{A'}|RR'X_{A'})_{\bar\gamma^{n,y}} \right) + 4 \cdot \log\frac{1}{\varepsilon_n} + 2 + \log\log|W_{A'}| + \log n , \tag{120}$$

for an additional error $\varepsilon_n$ in trace distance, leading to a total error of

$$(4 \cdot \varepsilon_n + 2 \cdot |W_{A'}|^{-n/2}) \tag{121}$$

in trace distance. Here $\bar\gamma^{n,y}$ refers to the typical projected state introduced next.

in trace distance. The min-entropy extractor is performed with respect to the following typical 
projected state, in order to increase the amount of randomness that can be extracted: 

^xwar = 1^ )pw-^\i{w Wn^ \x ){x |x" o 

II5 \W ){W \wi,i-i-s <^^V,<5 7*''"",'5^R" ^V'™",<5 • ^ ' 

In the rest of the proof, we bring the classical communication cost (118) and the sum cost (119) into the right form, and show that the asymptotic error for the measurement simulation (106) becomes zero. By the behavior of the max-information under projective measurements (Corollary 14), a dimension upper bound for the max-information (Lemma 17), the fact that we can assume $|R'| \le (n+1)^{|A|^2 - 1}$ (Proposition 28), and a quasi-convexity property of the max-information (Lemma 19), we get

$$c_n \le \max_i I_{\max}(W_{A'} : R)_{\gamma^{i,n}} + \chi , \tag{123}$$

where

$$\chi = 2 \cdot \log\left( (n+1)^{|A|^2 - 1} \right) + \log\left( (n+1)^{2|A||R| - 2} \right) + 4 \cdot \log\frac{1}{\varepsilon_n} + 4 + \log\log|W_{A'}| + \log n . \tag{124}$$

By an upper bound on the max-information (Lemma 13), and a lower bound on the conditional min-entropy (Lemma 11), this can be estimated to be
Cn < max {HR{WA')^^,n - H^i,,{WA'R)^^,n + Ho{R)^,,n) + x ■ (125) 
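By (114), as well as the properties of typical projectors (see Appendix E), we get

$$c_n \le n \cdot \max_i \left( H(W_{A'})_{\gamma^i} - H(W_{A'} R)_{\gamma^i} + H(R)_{\gamma^i} \right) + 5 n c \delta + \chi \tag{126}$$
$$\le n \cdot \max_\rho \left( H(W_{A'})_{\mathcal{N}(\rho)} - H(W_{A'} R)_{(\mathcal{N} \otimes \mathcal{I})(\rho)} + H(R)_\rho \right) + 5 n c \delta + \chi , \tag{127}$$

where $\rho_{AR} \in \mathcal{V}(\mathcal{H}_{AR})$, $c$ is a constant, and $\delta > 0$ is the typicality tolerance. By choosing

$$\varepsilon_n = (n+1)^{4(1 - |A|^2)} , \tag{128}$$

we finally get an asymptotic classical communication cost of

$$c = \limsup_{n \to \infty} \frac{c_n}{n} \le \max_\rho I(W_{A'} : R)_\beta , \tag{129}$$

where $\rho_{AR} \in \mathcal{V}(\mathcal{H}_{AR})$, $\beta_{W_{A'} R}$ is as in (103), and a vanishing asymptotic error (106), (121),

$$\limsup_{n \to \infty} \delta_n \le \limsup_{n \to \infty} \left( (n+1)^{|A|^2 - 1} \cdot \left( 4 \cdot (n+1)^{4(1 - |A|^2)} + 2 \cdot |W_{A'}|^{-n/2} \right) \right) = 0 . \tag{130}$$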



For the sum cost (119) we get by the definition of the measurement in (42) with outcomes $y$, and a line of argument as in (58), that

$$c_n + s_n \le \max_y \left( H_{\min}(W_{A'})_{\gamma^{n,y}} - H_{\min}(W_{A'}|RR'X_{A'})_{\bar\gamma^{n,y}} \right) + 4 \cdot \log\frac{1}{\varepsilon_n} + 2 + \log\log|W_{A'}| + \log n \tag{131}$$
$$\le \max_y I_{\max}(W_{A'} : RX_{A'})_{\gamma^{n,y}} + 2 \cdot \log\left( (n+1)^{|A|^2 - 1} \right) + 4 \cdot \log\frac{1}{\varepsilon_n} + 2 + \log\log|W_{A'}| + \log n , \tag{132}$$

where we used a lower bound on the max-information (Lemma 13), as well as a dimension upper bound for the max-information (Lemma 17), and the fact that $|R'| \le (n+1)^{|A|^2 - 1}$ (Proposition 28). Using similar arguments (see Appendix E) as in the estimation of the classical communication cost, we arrive at

$$c + s = \limsup_{n \to \infty} \frac{c_n + s_n}{n} \le \max_\rho I(W_{A'} : RX_{A'})_\beta , \tag{133}$$

where $\rho_{AR} \in \mathcal{V}(\mathcal{H}_{AR})$, and $\beta_{W_{A'} R X_{A'}}$ is as in (103). By minimizing over all decompositions of the measurement $\mathcal{M}$ as in (104), the claim follows. □
VI. EXTENSIONS AND APPLICATIONS

a. Structured State Splitting Scheme  The state splitting protocol presented in Theorem 5 has the drawback that the permutations $U^y$ used by Bob must be chosen at random and little is known about the structure of the unitaries $V^y$. We can remedy this by basing the state splitting protocol used in Theorem 5 on a modified state merging protocol instead of that in Lemma 2. The new protocol has the advantage that Alice's classical operation $P$ (recall that the roles are reversed) is a linear function rather than an arbitrary permutation, though still randomly chosen, and Bob's unitary operation $V$ is based on the decoder of an information reconciliation protocol. We now give a sketch of this modified state merging protocol.

The protocol is based on the observation from [8, 45] that state merging is a by-product of an entanglement distillation protocol in which Alice measures the stabilizers of a Calderbank-Shor-Steane (CSS) code such that, given the resulting (classical) syndrome results, Bob could determine both the amplitude (logical X value) and phase (logical Z value) of Alice's remaining encoded system by using his systems. (Associating X with amplitude instead of phase contravenes the usual convention in the QECC literature, but better fits the notation of the current paper.) Indeed, for state merging of classically coherent states such as $\rho_{X_A X_B B R}$ in Lemma 2, the situation is considerably simpler since Bob can already determine the amplitude of Alice's system $X_A$ by measuring $X_B$. For simplicity, let us regard $X_A$ as a collection of $k = \log|X_A|$ qubits.

Thus, from the analysis of [8, 45], all that remains is for Alice to measure a sufficient number of phase stabilizers from an error-correcting code to enable Bob to determine the phase of her encoded systems by using the syndromes and his systems $X_B$ and $B$, with probability of error at most $\varepsilon$. Use of a linear code ensures that Alice does not damage Bob's amplitude information in the course of trying to increase his phase information. Since the task at hand is equivalent to information reconciliation, the number of phase stabilizers needed for this purpose is no more than $H_{\max}(\widetilde{X}_A | X_B B)_\rho + 2 \log\frac{1}{\varepsilon} + 4$ [46], where $\widetilde{X}_A$ denotes the phase observable conjugate to the amplitude observable $X_A$.

To measure the phase stabilizers, Alice can apply a suitable unitary operation to all of her systems and then simply measure the phases of a certain subset of the outputs which correspond to the stabilizers [41]. But for stabilizer codes, this unitary just implements a linear transformation in the phase basis of the $k$ qubits, which can equally well be regarded as a linear transformation in the amplitude basis $\{|x\rangle\}_{x \in X_A}$. Therefore, just as in the original protocol, Alice applies a "classical" transformation of her system and sends one part of the output to Bob.
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For his part, Bob can complete the state merging protocol by coherently implementing the 
decoder from the information reconciliation protocol, a construction of which based on the pretty 
good measurement is given in [46]. 

Finally, the number of entangled systems generated in the state merging protocol is equal to 
the number of systems left at Alice's side, or $|X_A| - H_{\max}^{\varepsilon}(\tilde{X}_A|X_B B)_\rho - 2\log\frac{1}{\varepsilon} - 4$. Lemma 30 
shows that this is in fact greater than $H_{\min}^{\varepsilon}(X_A|R)_\rho - 2\log\frac{1}{\varepsilon} - 4$. Thus, the stabilizer-based state 
merging protocol achieves the same costs as the state merging protocol of Lemma 2 (up to terms 
of order $\log\frac{1}{\varepsilon}$). 

b. Fixed IID Source The case of a fixed IID source also follows easily from our analysis. We 
can simply apply the one-shot protocol from Theorem 5 to the case of a fixed IID source and then 
invoke the asymptotic equipartition property for the smooth max-information and the smooth 
max-entropy. In this way, we provide an alternative proof of this special case that avoids the use 
of typical projectors and the operator Chernoff bound [60]. 

c. Instrument Compression In Winter's original paper on measurement compression, additional 
arguments were required to establish that a POVM (positive operator-valued measure) 
compression protocol can function as an instrument compression protocol, where for an instrument 
compression protocol, Alice and Bob receive the classical outcomes of the measurement while Alice 
obtains the post-measurement states (see Section V of [60]). We note that our protocol here already 
functions as an instrument compression protocol due to our use of the classical state splitting 
protocol as a coding primitive. 

d. Universal Measurement Compression with Quantum Side Information We briefly mention 
that there is no point in considering a protocol for universal measurement compression with quantum 
side information (similar to the observation in Section 6.3 of [18]). In such a scenario, the 
receiver would obtain some quantum side information correlated with the state on which the measurement 
should be simulated (see [57] for the case of measurement compression with quantum 
side information for a fixed IID source). However, a universal protocol must simulate the 
measurement with respect to an arbitrary input state, and one such input is a product state 
between the quantum side information and the input state, for which the side information is of no help. 
Thus, the universal protocol given here is suitable for this case. This occurs simply because our 
simulation is with respect to the diamond norm, and the diamond norm is known to be robust under 
tensoring with other systems upon which the channel of interest does not act. 

Another way to see this is that one could imagine devising a protocol for which quantum side 
information is taken into account. Based on the results in Ref. [57], we would expect the rate 
of classical communication for such a protocol to be equal to the following information quantity: 
$\max_{\rho} I(X : R \,|\, B)_{(\mathcal{I}\otimes\mathcal{M}_{A\to X})(\rho)}$, where $\rho_{AB} \in \mathcal{S}(AB)$ is an input state with quantum side information 
in the system $B$, and $\rho_{RAB} \in \mathcal{V}(RAB)$ is a purification of $\rho_{AB}$. However, as shown in Theorem 16 
of [18], the above information quantity is actually equal to the information quantity in (5), so 
that there is no improvement in the communication rate from the availability of quantum side 
information. 



VII. CONCLUSION 



We have justified the information-theoretic measure in (5) as quantifying the information gain 
of a quantum measurement, by providing an operational interpretation in terms of a protocol 
for universal measurement compression. The main tools used to prove this result are the post- 
selection technique for quantum channels and a novel classical state splitting protocol based on 
permutation-based extractors. 

There are a number of open questions to consider going forward from here. Given that there are 
applications of "information gain" or "entropy reduction" in thermodynamics [35] and quantum 
feedback control [21], it would be interesting to explore whether the quantity in (5) has some 
application in these domains. Also, Buscemi et al. showed that the static measure of information 
gain in (3) plays a role in quantifying the trade-off between information extraction and disturbance 
[9], and it would be interesting to determine if there is a role in this setting for the information 
quantity in (5). 



ACKNOWLEDGMENTS 



We acknowledge discussions with Francesco Buscemi, Matthias Christandl, Nilanjana Datta, 
Patrick Hayden, Renato Renner, and Marco Tomamichel. MB and JMR are supported by the Swiss 
National Science Foundation through the National Centre of Competence in Research 'Quantum 
Science and Technology'. MB is also supported by Swiss National Science Foundation grants 
PP00P2-128455 and 20CH21-138799 and German Science Foundation grant CH 843/2-1. JMR is 
also supported by European Research Council grant 258932. MMW acknowledges support from 
the Centre de Recherches Mathematiques at the University of Montreal. 






Appendix A: Entropies 
Lemma 11. [47, Lemma 3.1.10] Let $\rho_{AB} \in \mathcal{S}_{\leq}(\mathcal{H}_{AB})$. Then we have that 

$$H_{\min}(A|B)_\rho \geq H_{\min}(AB)_\rho - H_0(B)_\rho . \tag{A1}$$

The max-mutual information is monotone under local operations. 

Lemma 12. [7, Lemma B.14] Let $\rho_{AB} \in \mathcal{S}_{\leq}(AB)$, and let $\mathcal{E}$ be a quantum channel of the form 
$\mathcal{E} = \mathcal{E}_A \otimes \mathcal{E}_B$. Then we have that 

$$I_{\max}(A : B)_\rho \geq I_{\max}(A : B)_{\mathcal{E}(\rho)} . \tag{A2}$$

The max-information can be upper and lower bounded in terms of entropies. 

Lemma 13. [7, Lemma B.10] Let $\rho_{AB} \in \mathcal{S}_{\leq}(AB)$. Then we have that 

$$H_R(A)_\rho - H_{\min}(A|B)_\rho \geq I_{\max}(A : B)_\rho \geq H_{\min}(A)_\rho - H_0(A|B)_\rho , \tag{A3}$$

where $H_R(\rho)$ is defined as the negative logarithm of the smallest eigenvalue of $\rho$ on its support [7]. 
The following lemma is about the behavior of the max-information under projective measurements. 

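Lemma 14. [7, Corollary B.16] Let $\rho_{AB} \in \mathcal{S}_{\leq}(AB)$, and let $P = \{P_A^i\}_{i \in I}$ be a collection of 
projectors that describe a projective measurement on $A$. For $\mathrm{tr}[P_A^i \rho_A] \neq 0$, let $p_i = \mathrm{tr}[P_A^i \rho_A]$, and 
$\rho_{AB}^i = p_i^{-1} \cdot P_A^i \rho_{AB} P_A^i$. Then we have that 

$$I_{\max}(A : B)_\rho \geq \max_i I_{\max}(A : B)_{\rho^i} , \tag{A4}$$

where the maximum ranges over all $i$ for which $\rho_{AB}^i$ is defined. 

Lemma 15. Let $\varepsilon \geq 0$, and let $\rho_{XR} \in \mathcal{S}(XR)$ be classical on $X$ with respect to the basis $\{|x\rangle\}_{x \in X}$. 
Then there exists $\bar{\rho}_{XR} \in \mathcal{B}^{\varepsilon}(\rho_{XR})$ classical on $X$ with respect to the basis $\{|x\rangle\}_{x \in X}$ such that 

$$I_{\max}^{\varepsilon}(X : R)_\rho = I_{\max}(X : R)_{\bar{\rho}} . \tag{A5}$$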

Proof. This is standard and can be proven exactly as in [52, Proposition 5.8]. □ 

We need the following monotonicity of the max-information. 

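Lemma 16. Let $\rho_{AR} \in \mathcal{S}(AR)$, and $\Pi_A \in \mathcal{P}(A)$ with $\Pi_A \leq \mathbb{1}_A$. Then we have that 

$$I_{\max}(A : R)_\rho \geq I_{\max}(A : R)_{\Pi\rho\Pi} . \tag{A6}$$

Proof. Let $\sigma_R \in \mathcal{S}(R)$, and let $\lambda \in \mathbb{R}$ be such that $I_{\max}(A : R)_\rho = D_{\max}(\rho_{AR} \| \rho_A \otimes \sigma_R) = \log \lambda$. 
Then we have that $\lambda \cdot \rho_A \otimes \sigma_R \geq \rho_{AR}$, and with this 

$$\lambda \cdot \rho_A \otimes \sigma_R \geq \lambda \cdot \Pi_A \rho_A \Pi_A \otimes \sigma_R \geq \Pi_A \rho_{AR} \Pi_A . \tag{A7}$$

Hence, we have $\log \lambda \geq D_{\max}(\Pi_A \rho_{AR} \Pi_A \| \Pi_A \rho_A \Pi_A \otimes \sigma_R) \geq I_{\max}(A : R)_{\Pi\rho\Pi}$. □ 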

The following is a bound on the increase of the smooth max-information when an additional 
subsystem is added. 

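Lemma 17. [7, Lemma B.9] Let $\varepsilon \geq 0$, and let $\rho_{ABR} \in \mathcal{S}(ABR)$. Then we have that 

$$I_{\max}^{\varepsilon}(A : BR)_\rho \leq I_{\max}^{\varepsilon}(A : B)_\rho + 2 \cdot \log |R| . \tag{A8}$$

The following is a strengthening of the bound in Lemma 17 when the additional system is 
classical. 

Lemma 18. Let $\rho_{ABX} \in \mathcal{S}(ABX)$ be classical on $\mathcal{H}_X$ with respect to the basis $\{|x\rangle\}_{x \in X}$. Then 
we have that 

$$I_{\max}(A : BX)_\rho \leq I_{\max}(A : B)_\rho + \log |X| . \tag{A9}$$

Proof. Let $\sigma_B \in \mathcal{S}(B)$ be such that 

$$I_{\max}(A : B)_\rho = D_{\max}(\rho_{AB} \| \rho_A \otimes \sigma_B) = \log \mu , \tag{A10}$$

that is, $\mu \in \mathbb{R}$ is minimal such that $\mu \cdot \rho_A \otimes \sigma_B \geq \rho_{AB}$. This implies $\mu \cdot \rho_A \otimes \sigma_B \otimes \frac{\mathbb{1}_X}{|X|} \geq \rho_{AB} \otimes \frac{\mathbb{1}_X}{|X|}$. 
But we have by [47, Lemma 3.1.9] that $\rho_{AB} \otimes \mathbb{1}_X \geq \rho_{ABX}$, and hence $\mu \cdot \rho_A \otimes \sigma_B \otimes \frac{\mathbb{1}_X}{|X|} \geq \frac{1}{|X|} \rho_{ABX}$. 
Now, let $\lambda \in \mathbb{R}$ be minimal such that $\lambda \cdot \rho_A \otimes \sigma_B \otimes \frac{\mathbb{1}_X}{|X|} \geq \rho_{ABX}$. Thus, it follows that $\lambda \leq \mu \cdot |X|$, 
and from this we get 

$$I_{\max}(A : BX)_\rho \leq D_{\max}\!\left(\rho_{ABX} \,\Big\|\, \rho_A \otimes \sigma_B \otimes \frac{\mathbb{1}_X}{|X|}\right) = \log \lambda \tag{A11}$$
$$\leq D_{\max}(\rho_{AB} \| \rho_A \otimes \sigma_B) + \log |X| = I_{\max}(A : B)_\rho + \log |X| . \tag{A12}$$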

□ 

The smooth max-information is quasi-convex in its argument in the following sense. 

Lemma 19. [7, Lemma B.18] Let $\varepsilon \geq 0$, and let $\rho_{AB} = \sum_{i \in I} p_i \rho_{AB}^i \in \mathcal{S}_{\leq}(AB)$ with $\rho_{AB}^i \in \mathcal{S}_{\leq}(AB)$ 
for $i \in I$. Then we have that 

$$I_{\max}^{\varepsilon}(A : B)_\rho \leq \max_{i \in I} I_{\max}^{\varepsilon}(A : B)_{\rho^i} + \log |I| . \tag{A13}$$

The following is a quasi-convexity property of the zero-Renyi entropy. 

Lemma 20. [4, Lemma 26] Let $\varepsilon \geq 0$, and let $\rho_A = \sum_{j=1}^{N} p_j \rho_A^j \in \mathcal{S}(A)$ with $p_j > 0$ for $j = 
1, \ldots, N$. Then we have that 

$$H_0^{\varepsilon}(A)_\rho \leq \max_j H_0^{\varepsilon}(A)_{\rho^j} + \log N . \tag{A14}$$

The smooth max-entropy and smooth zero-Renyi entropy are equivalent in the following sense. 

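Lemma 21. Let $\varepsilon > 0$, $\varepsilon' \geq 0$, and $\rho_A \in \mathcal{S}(A)$. Then we have that 

$$H_0^{\varepsilon'}(A)_\rho \geq H_{\max}^{\varepsilon'}(A)_\rho \geq H_0^{\varepsilon' + \sqrt{2\varepsilon}}(A)_\rho - 2 \cdot \log \frac{1}{\varepsilon} . \tag{A15}$$

Proof. Since the (unconditional) max-entropy is the Renyi entropy of order 1/2, the first inequality 
just follows from the ordering of the Renyi entropies [42, 48]. 

The idea for the proof of the second inequality is from the supplementary material [6, Lemma 
13]. Let $\sigma_A \in \mathcal{B}^{\varepsilon'}(\rho_A)$ such that $H_{\max}^{\varepsilon'}(A)_\rho = H_{\max}(A)_\sigma$ and let $\sigma_A = \sum_i t_i |i\rangle\langle i|_A$ be a spectral 
decomposition of $\sigma_A$ where the eigenvalues $t_i$ are ordered non-increasingly. Define the projector 
$\Pi_A^k = \sum_{i \geq k} |i\rangle\langle i|_A$, let $j$ be the smallest index such that $\mathrm{tr}[\Pi_A^j \sigma_A] \leq \varepsilon$, and define $\Pi_A = \mathbb{1}_A - \Pi_A^j$ 
as well as $\bar{\sigma}_A = \Pi_A \sigma_A \Pi_A$. By [6, Lemma 13] we have 

$$H_{\max}(A)_\sigma \geq -\log \sup\{\lambda : \bar{\sigma}_A \geq \lambda \cdot \bar{\sigma}_A^0\} - 2 \cdot \log \frac{1}{\varepsilon} \geq \log \mathrm{tr}[\bar{\sigma}_A^0] - 2 \cdot \log \frac{1}{\varepsilon} \tag{A16}$$
$$= H_0(A)_{\bar{\sigma}} - 2 \cdot \log \frac{1}{\varepsilon} , \tag{A17}$$

and furthermore 

$$P(\bar{\sigma}_A, \rho_A) \leq P(\sigma_A, \rho_A) + P(\bar{\sigma}_A, \sigma_A) \leq \varepsilon' + P(\Pi_A \sigma_A \Pi_A, \sigma_A) \leq \varepsilon' + \sqrt{1 - (\mathrm{tr}[\Pi_A^2 \sigma_A])^2} \tag{A18}$$
$$\leq \varepsilon' + \sqrt{1 - (1 - \varepsilon)^2} \leq \varepsilon' + \sqrt{2\varepsilon} , \tag{A19}$$

where we used the triangle inequality for the purified distance, and a gentle measurement lemma 
for the purified distance (Lemma 26). Thus, we have 

$$H_{\max}^{\varepsilon'}(A)_\rho = H_{\max}(A)_\sigma \geq H_0(A)_{\bar{\sigma}} - 2 \cdot \log \frac{1}{\varepsilon} \geq H_0^{\varepsilon' + \sqrt{2\varepsilon}}(A)_\rho - 2 \cdot \log \frac{1}{\varepsilon} . \tag{A20}$$

□ 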
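
As a small numerical illustration of the ordering of the Renyi entropies used for the first inequality, the following sketch evaluates the entropies of order 0, 1/2, 1 and infinity for a probability distribution, that is, for a state that is diagonal in a fixed basis and without any smoothing ($\varepsilon' = 0$). The distribution `probs` and the function name are hypothetical and chosen only for illustration.

```python
import math


def renyi_entropy(probs, alpha):
    """Renyi entropy (in bits) of order alpha for a probability vector."""
    support = [p for p in probs if p > 0]
    if alpha == 0:
        # Zero-Renyi entropy: logarithm of the support size.
        return math.log2(len(support))
    if alpha == 1:
        # Shannon (von Neumann) entropy as the limit alpha -> 1.
        return -sum(p * math.log2(p) for p in support)
    if alpha == math.inf:
        # Min-entropy: negative logarithm of the largest eigenvalue.
        return -math.log2(max(support))
    return math.log2(sum(p ** alpha for p in support)) / (1 - alpha)


# Hypothetical spectrum of a state.
probs = [0.5, 0.25, 0.125, 0.125]

h_zero = renyi_entropy(probs, 0)
h_max = renyi_entropy(probs, 0.5)   # unconditional max-entropy (order 1/2)
h_shannon = renyi_entropy(probs, 1)
h_min = renyi_entropy(probs, math.inf)

print(h_zero, h_max, h_shannon, h_min)
```

The printed values are non-increasing in the order, which is the unsmoothed version of $H_0(A)_\rho \geq H_{\max}(A)_\rho$.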



The zero-Renyi entropy can be smoothed by applying a projection. 






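Lemma 22. Let $\varepsilon \geq 0$, and let $\rho_A \in \mathcal{S}(A)$. Then there exists $\Pi_A \in \mathcal{P}(A)$ with $\Pi_A \leq \mathbb{1}_A$, diagonal 
in any eigenbasis of $\rho_A$, such that 

$$H_0^{\varepsilon}(A)_\rho \geq H_0(A)_{\Pi\rho\Pi} , \tag{A21}$$

and $\Pi_A \rho_A \Pi_A \in \mathcal{B}^{\sqrt{4\varepsilon}}(\rho_A)$. 

Proof. The idea for the proof is from the supplementary material [6, Lemma 14]. Let $\sigma_A \in \mathcal{B}^{\varepsilon}(\rho_A)$ 
such that $H_0^{\varepsilon}(A)_\rho = H_0(A)_\sigma$. It follows from the supplementary material [6, Lemma 8], that $\sigma_A$ 
can be taken to be diagonal in any eigenbasis of $\rho_A$. Define 

$$\bar{\sigma}_A = \sigma_A - \{\sigma_A - \rho_A\}_+ = \rho_A - \{\rho_A - \sigma_A\}_+ , \tag{A22}$$

where $\{\cdot\}_+$ denotes the positive part of an operator. This implies $\bar{\sigma}_A \leq \sigma_A$, and we then have 
$H_0^{\varepsilon}(A)_\rho \geq H_0(A)_{\bar{\sigma}}$. Since $\bar{\sigma}_A$ and $\rho_A$ also have the same eigenbasis, it follows that there exists 
$\Pi_A \in \mathcal{P}(A)$ with $\Pi_A \leq \mathbb{1}_A$ such that $\bar{\sigma}_A = \Pi_A \rho_A \Pi_A$. Furthermore, we get by the equivalence of 
the trace distance and the purified distance (Lemma 24) that 

$$P(\rho_A, \bar{\sigma}_A) \leq \sqrt{\|\rho_A - \bar{\sigma}_A\|_1 + |\mathrm{tr}[\rho_A] - \mathrm{tr}[\bar{\sigma}_A]|} = \sqrt{2 \cdot \mathrm{tr}[\{\rho_A - \sigma_A\}_+]} \leq \sqrt{2 \cdot \|\rho_A - \sigma_A\|_1} \tag{A23}$$
$$\leq \sqrt{4 \cdot P(\rho_A, \sigma_A)} \leq \sqrt{4\varepsilon} . \tag{A24}$$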



□ 

The fully quantum asymptotic equipartition property for the smooth max-information and the 
smooth max-entropy is as follows. 

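Lemma 23. [7, Lemma B.21][53, Theorem 9] Let $\varepsilon > 0$, $n \geq 2 \cdot (1 - \varepsilon^2)$, and $\rho_{AB} \in \mathcal{S}(AB)$. Then 
we have that 

$$\frac{1}{n} I_{\max}^{\varepsilon}(A : B)_{\rho^{\otimes n}} \leq I(A : B)_\rho + \frac{\xi(\varepsilon)}{\sqrt{n}} - \frac{2}{n} \cdot \log \frac{\varepsilon^2}{24} , \tag{A25}$$
$$\frac{1}{n} H_{\max}^{\varepsilon}(A)_{\rho^{\otimes n}} \leq H(A)_\rho + \frac{\eta(\varepsilon)}{\sqrt{n}} , \tag{A26}$$

where $\xi(\varepsilon) = 8\sqrt{13 - 4 \cdot \log \varepsilon} \cdot (2 + \frac{1}{2} \cdot \log |A|)$, and $\eta(\varepsilon) = 4\sqrt{1 - 2 \cdot \log \varepsilon} \cdot (2 + \frac{1}{2} \cdot \log |A|)$. 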
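
To give a feeling for how quickly the finite-$n$ corrections in Lemma 23 vanish, the following sketch evaluates them numerically. It assumes that the correction terms read $\xi(\varepsilon)/\sqrt{n} - (2/n)\log(\varepsilon^2/24)$ and $\eta(\varepsilon)/\sqrt{n}$, with $\xi(\varepsilon) = 8\sqrt{13 - 4\log\varepsilon}\,(2 + \frac{1}{2}\log|A|)$ and $\eta(\varepsilon) = 4\sqrt{1 - 2\log\varepsilon}\,(2 + \frac{1}{2}\log|A|)$, and that logarithms are taken to base 2. The function names and the sample values of $\varepsilon$, $|A|$ and $n$ are hypothetical.

```python
import math


def xi(eps, dim_a):
    """Constant in the max-information correction, as read from Lemma 23."""
    return 8 * math.sqrt(13 - 4 * math.log2(eps)) * (2 + 0.5 * math.log2(dim_a))


def eta(eps, dim_a):
    """Constant in the max-entropy correction, as read from Lemma 23."""
    return 4 * math.sqrt(1 - 2 * math.log2(eps)) * (2 + 0.5 * math.log2(dim_a))


def max_info_overhead(n, eps, dim_a):
    """Per-copy gap between the smooth max-information bound and I(A:B)."""
    return xi(eps, dim_a) / math.sqrt(n) - (2 / n) * math.log2(eps ** 2 / 24)


def max_entropy_overhead(n, eps, dim_a):
    """Per-copy gap between the smooth max-entropy bound and H(A)."""
    return eta(eps, dim_a) / math.sqrt(n)


eps, dim_a = 1e-3, 2  # hypothetical smoothing parameter and qubit system A
for n in (10**2, 10**4, 10**6, 10**8):
    print(n, max_info_overhead(n, eps, dim_a), max_entropy_overhead(n, eps, dim_a))
```

Both overheads decay like $1/\sqrt{n}$ for fixed $\varepsilon$, so the rates converge to $I(A : B)_\rho$ and $H(A)_\rho$ in the limit of many copies.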



Appendix B: Misc Lemmas 



The following gives lower and upper bounds to the purified distance in terms of the trace 
distance. 

Lemma 24. [54, Lemma 6] Let $\rho, \sigma \in \mathcal{S}_{\leq}(A)$. Then we have that 

$$\frac{1}{2} \cdot \|\rho_A - \sigma_A\|_1 \leq P(\rho_A, \sigma_A) \leq \sqrt{\|\rho_A - \sigma_A\|_1 + |\mathrm{tr}[\rho_A] - \mathrm{tr}[\sigma_A]|} . \tag{B1}$$
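
The two bounds of Lemma 24 can be checked numerically for states that are diagonal in a common basis, where all quantities reduce to expressions in the eigenvalues. The sketch below assumes the standard definition of the purified distance, $P(\rho, \sigma) = \sqrt{1 - \bar{F}(\rho, \sigma)^2}$ with the generalized fidelity $\bar{F}(\rho, \sigma) = F(\rho, \sigma) + \sqrt{(1 - \mathrm{tr}\rho)(1 - \mathrm{tr}\sigma)}$; this definition is not restated in this appendix, and the example spectra are hypothetical.

```python
import math


def trace_norm_diff(p, q):
    """Trace norm of the difference of two commuting states given by their spectra."""
    return sum(abs(a - b) for a, b in zip(p, q))


def purified_distance(p, q):
    """Purified distance for commuting (possibly subnormalized) states."""
    fidelity = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # Extra term of the generalized fidelity; it vanishes for normalized states.
    slack = math.sqrt(max(0.0, 1 - sum(p)) * max(0.0, 1 - sum(q)))
    generalized_fidelity = min(1.0, fidelity + slack)
    return math.sqrt(1 - generalized_fidelity ** 2)


def lemma24_bounds(p, q):
    """Return (lower bound, purified distance, upper bound) as in (B1)."""
    l1 = trace_norm_diff(p, q)
    lower = 0.5 * l1
    upper = math.sqrt(l1 + abs(sum(p) - sum(q)))
    return lower, purified_distance(p, q), upper


# Hypothetical spectra of two commuting states.
rho = [0.5, 0.3, 0.2]
sigma = [0.4, 0.4, 0.2]
lower, dist, upper = lemma24_bounds(rho, sigma)
print(lower, dist, upper)
```

For this pair the purified distance lies strictly between half the trace norm of the difference and the square-root upper bound.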

The purified distance is convex in its arguments in the following sense. 

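Lemma 25. [7, Lemma A.3] Let $\rho_A^i, \sigma_A^i \in \mathcal{S}_{\leq}(A)$ be with $\rho_A^i \approx_{\varepsilon} \sigma_A^i$ for $i \in I$, and $\{p_i\}_{i \in I}$ a 
probability distribution. Then we have that 

$$\sum_{i \in I} p_i \rho_A^i \approx_{\varepsilon} \sum_{i \in I} p_i \sigma_A^i . \tag{B2}$$

The following is a gentle measurement lemma for the purified distance. 

Lemma 26. [6, Lemma 7] Let $\rho_A \in \mathcal{S}(A)$, and $\Pi_A \in \mathcal{P}(A)$ with $\Pi_A \leq \mathbb{1}_A$. Then we have that 

$$P(\rho_A, \Pi_A \rho_A \Pi_A) \leq \sqrt{1 - (\mathrm{tr}[\Pi_A^2 \rho_A])^2} . \tag{B3}$$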



Appendix C: Extractors Based on Permutations 

The following proposition concerns permutation-based extractors (operations that extract uni- 
form randomness independent of an adversary's information), and it is critical in establishing our 
protocol for state merging of classically coherent states. 

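Proposition 27. [51, Section 5.2] Let $\rho_{XR} \in \mathcal{S}(XR)$ be classical on $X$ with respect to $\{|x\rangle\}_{x \in X}$, 
and $X = X_1 X_2$. Then we have that 

$$\frac{1}{|X|!} \sum_{P \in \mathbb{P}(X)} \left\| \mathrm{tr}_{X_2}\!\left[ (P \otimes \mathbb{1}_R) \rho_{XR} (P^{\dagger} \otimes \mathbb{1}_R) \right] - \frac{\mathbb{1}_{X_1}}{|X_1|} \otimes \rho_R \right\|_1 \leq \sqrt{|X_1| \cdot 2^{-H_{\min}(X|R)_\rho}} , \tag{C1}$$

where $\mathbb{P}(X)$ denotes the group of permutation matrices on $\mathcal{H}_X$ with respect to $\{|x\rangle\}_{x \in X}$, defined 
as $P(\pi)|x\rangle = |\pi(x)\rangle$ for $\pi \in S_{|X|}$, the symmetric group on $\{1, 2, \ldots, |X|\}$. 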
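
For intuition, a bound of the form "average distance from uniform is at most $\sqrt{|X_1| \cdot 2^{-H_{\min}(X|R)_\rho}}$" can be checked by brute force in the simplest possible setting. The sketch below assumes a trivial system $R$ (no side information), so that the state is just a probability distribution on $X = X_1 X_2$, and it enumerates all permutations of a four-element alphabet. The distribution and the function names are hypothetical.

```python
import math
from itertools import permutations


def average_distance_from_uniform(probs, size_x1, size_x2):
    """Average over all permutations of the L1 distance between the X1 marginal
    of the permuted distribution and the uniform distribution on X1."""
    size_x = size_x1 * size_x2
    total = 0.0
    count = 0
    for perm in permutations(range(size_x)):
        marginal = [0.0] * size_x1
        for x, p in enumerate(probs):
            # The permuted label perm[x] is split as (x1, x2); keep only x1.
            marginal[perm[x] // size_x2] += p
        total += sum(abs(m - 1 / size_x1) for m in marginal)
        count += 1
    return total / count


def extractor_bound(probs, size_x1):
    """Right-hand side sqrt(|X1| * 2^{-H_min(X)}) for trivial side information."""
    h_min = -math.log2(max(probs))
    return math.sqrt(size_x1 * 2 ** (-h_min))


# Hypothetical distribution on |X| = 4, split as |X1| = |X2| = 2.
probs = [0.4, 0.3, 0.2, 0.1]
avg = average_distance_from_uniform(probs, 2, 2)
bound = extractor_bound(probs, 2)
print(avg, bound)
```

In this toy case the average distance is well below the bound; the bound becomes informative when $|X_1|$ is chosen small compared with $2^{H_{\min}(X|R)_\rho}$.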

Appendix D: The Post- Selection Technique 

The following proposition lies at the heart of the post-selection technique for quantum channels. 

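Proposition 28. [10] Let $\varepsilon > 0$, and let $\mathcal{E}_A^n$ and $\mathcal{F}_A^n$ be quantum channels from $\mathcal{L}(A^{\otimes n})$ to $\mathcal{L}(B)$. If 
there exists a quantum channel $\mathcal{K}_\pi$ for any permutation $\pi$ such that $(\mathcal{E}_A^n - \mathcal{F}_A^n) \circ \pi = \mathcal{K}_\pi \circ (\mathcal{E}_A^n - \mathcal{F}_A^n)$, 
then $\mathcal{E}_A^n$ and $\mathcal{F}_A^n$ are $\varepsilon$-close whenever 

$$\left\| \left( (\mathcal{E}_A^n - \mathcal{F}_A^n) \otimes \mathcal{I}_{RR'} \right) (\zeta_{ARR'}^n) \right\|_1 \leq \varepsilon (n + 1)^{-(|A|^2 - 1)} , \tag{D1}$$

where $\zeta_{ARR'}^n$ is a purification of the de Finetti state $\zeta_{AR}^n = \int \sigma_{AR}^{\otimes n} \, d(\sigma_{AR})$ with $\sigma_{AR} \in \mathcal{V}(AR)$, $A \cong R$, 
and $d(\cdot)$ the measure on the normalized pure states on $AR$ induced by the Haar measure on the 
unitary group acting on $AR$, normalized to $\int d(\cdot) = 1$. Furthermore, we can assume without loss 
of generality that $|R'| \leq (n + 1)^{|A|^2 - 1}$. 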
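
The factor $(n + 1)^{|A|^2 - 1}$ in the post-selection technique grows only polynomially in $n$, so its contribution per copy vanishes. The following sketch makes this quantitative. The function names are hypothetical, logarithms are taken to base 2, and the "per-copy penalty" is our own illustrative quantity, namely the logarithm of the polynomial factor divided by $n$.

```python
import math


def postselection_exponent(dim_a):
    """Exponent |A|^2 - 1 of the polynomial factor."""
    return dim_a ** 2 - 1


def required_closeness_log2(n, eps, dim_a):
    """log2 of the threshold eps * (n + 1)^{-(|A|^2 - 1)} on the de Finetti input."""
    return math.log2(eps) - postselection_exponent(dim_a) * math.log2(n + 1)


def per_copy_penalty(n, dim_a):
    """log2 of the polynomial factor divided by n; also bounds log2|R'| / n."""
    return postselection_exponent(dim_a) * math.log2(n + 1) / n


dim_a = 2  # hypothetical qubit input system
for n in (10, 10**2, 10**4, 10**6):
    print(n, required_closeness_log2(n, 1e-3, dim_a), per_copy_penalty(n, dim_a))
```

If the error on the purified de Finetti state decays exponentially in $n$, the polynomial factor therefore does not affect the achievable rates.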

A straightforward application of Caratheodory's theorem gives the following. 

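Lemma 29. [7, Corollary D.6] Let $\zeta_{AR}^n = \int \sigma_{AR}^{\otimes n} \, d(\sigma_{AR})$ as in Proposition 28. Then we have that 
$\zeta_{AR}^n = \sum_i p_i (\omega_{AR}^i)^{\otimes n}$ with $\omega_{AR}^i \in \mathcal{V}(AR)$, $i \in \{1, 2, \ldots, (n + 1)^{2|A||R| - 2}\}$, and $\{p_i\}$ a probability 
distribution. 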



Appendix E: Typical Projectors 

A sequence $x^n$ is typical with respect to some probability distribution $p_X(x)$ if its empirical 
distribution has maximum deviation $\delta$ from $p_X(x)$. The typical set $T_\delta^{X^n}$ is the set of all such 
sequences: 

$$T_\delta^{X^n} \equiv \left\{ x^n : \left| \frac{1}{n} N(x|x^n) - p_X(x) \right| \leq \delta \quad \forall x \in \mathcal{X} \right\} , \tag{E1}$$

where $N(x|x^n)$ counts the number of occurrences of the letter $x$ in the sequence $x^n$. The above 
notion of typicality is the "strong" notion (as opposed to the weaker "entropic" version of typicality 
sometimes employed [12]). The typical set enjoys three useful properties: its probability approaches 
unity in the large $n$ limit, it has exponentially smaller cardinality than the set of all sequences, 
and every sequence in the typical set has approximately uniform probability. That is, suppose 
that $X^n$ is a random variable distributed according to $p_{X^n}(x^n) \equiv p_X(x_1) \cdots p_X(x_n)$, $\varepsilon$ is a positive 
number that becomes arbitrarily small as $n$ becomes large, and $c$ is some positive constant. Then 
the following three properties hold [12]: 

$$\Pr\{X^n \in T_\delta^{X^n}\} \geq 1 - \varepsilon , \tag{E2}$$
$$|T_\delta^{X^n}| \leq 2^{n[H(X) + c\delta]} , \tag{E3}$$
$$\forall x^n \in T_\delta^{X^n} : \quad 2^{-n[H(X) + c\delta]} \leq p_{X^n}(x^n) \leq 2^{-n[H(X) - c\delta]} . \tag{E4}$$
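
The strong notion of typicality can be explored by brute force for short sequences. The following sketch enumerates all sequences of length $n$ over a binary alphabet, tests the condition $|N(x|x^n)/n - p_X(x)| \leq \delta$ for every letter, and reports the size and total probability of the resulting set. The distribution, the block length, and the value of $\delta$ are hypothetical, and the function names are our own.

```python
import math
from itertools import product


def is_strongly_typical(seq, dist, delta):
    """Check |N(x|x^n)/n - p(x)| <= delta for every letter x of the alphabet."""
    n = len(seq)
    return all(abs(seq.count(x) / n - p) <= delta for x, p in dist.items())


def typical_set_stats(dist, n, delta):
    """Return (cardinality, probability) of the strongly typical set."""
    size = 0
    prob = 0.0
    for seq in product(dist, repeat=n):
        if is_strongly_typical(seq, dist, delta):
            size += 1
            prob += math.prod(dist[x] for x in seq)
    return size, prob


# Hypothetical binary source.
dist = {"a": 0.75, "b": 0.25}
n, delta = 12, 0.1
size, prob = typical_set_stats(dist, n, delta)
entropy = -sum(p * math.log2(p) for p in dist.values())
print(size, len(dist) ** n, prob, 2 ** (n * entropy))
```

Already at this small block length the typical set contains a minority of all sequences while carrying most of the probability; the probability approaches one only as $n$ grows.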

These properties translate straightforwardly to the quantum setting by applying the spectral 
theorem to a density operator $\rho$. That is, suppose that 

$$\rho = \sum_x p_X(x) |x\rangle\langle x| , \tag{E5}$$

for some orthonormal basis $\{|x\rangle\}$. Then there is a typical subspace defined as follows: 

$$T_{\rho,\delta}^n \equiv \mathrm{span}\left\{ |x^n\rangle : \left| \frac{1}{n} N(x|x^n) - p_X(x) \right| \leq \delta \quad \forall x \in \mathcal{X} \right\} , \tag{E6}$$

and let $\Pi_{\rho,\delta}^n$ denote the projector onto it. Then properties analogous to (E2-E4) hold for the typical 
subspace. The probability that a tensor power state $\rho^{\otimes n}$ is in the typical subspace approaches unity 
as $n$ becomes large, the rank of the typical projector is exponentially smaller than the rank of the 
full $n$-fold tensor-product Hilbert space of $\rho^{\otimes n}$, and the state $\rho^{\otimes n}$ "looks" approximately maximally 
mixed on the typical subspace: 

$$\mathrm{Tr}\{\Pi_{\rho,\delta}^n \rho^{\otimes n}\} \geq 1 - \varepsilon , \tag{E7}$$
$$\mathrm{Tr}\{\Pi_{\rho,\delta}^n\} \leq 2^{n[H(B) + c\delta]} , \tag{E8}$$
$$2^{-n[H(B) + c\delta]} \, \Pi_{\rho,\delta}^n \leq \Pi_{\rho,\delta}^n \rho^{\otimes n} \Pi_{\rho,\delta}^n \leq 2^{-n[H(B) - c\delta]} \, \Pi_{\rho,\delta}^n , \tag{E9}$$

where $H(B)$ is the entropy of $\rho$. 

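Suppose now that we have an ensemble of the form $\{p_X(x), \rho_x\}$, and suppose that we generate 
a typical sequence $x^n$ according to a "pruned" distribution (defined as a normalized version of 
$p_{X^n}(x^n)$ with support on its typical set and zero otherwise), leading to a tensor product state 
$\rho_{x^n} \equiv \rho_{x_1} \otimes \cdots \otimes \rho_{x_n}$. Then there is a conditionally typical subspace with a conditionally typical 
projector defined as follows: 

$$\Pi_{\rho_{x^n},\delta}^n \equiv \bigotimes_{x \in \mathcal{X}} \Pi_{\rho_x,\delta}^{I_x} , \tag{E10}$$

where $I_x \equiv \{i : x_i = x\}$ is an indicator set that selects the indices $i$ in the sequence $x^n$ for which the 
$i$-th symbol $x_i$ is equal to $x \in \mathcal{X}$ and $\Pi_{\rho_x,\delta}^{I_x}$ is the typical projector for the state $\rho_x$. The conditionally 
typical subspace has the three following properties: 

$$\mathrm{Tr}\{\Pi_{\rho_{x^n},\delta}^n \rho_{x^n}\} \geq 1 - \varepsilon , \tag{E11}$$
$$\mathrm{Tr}\{\Pi_{\rho_{x^n},\delta}^n\} \leq 2^{n[H(B|X) + c\delta]} , \tag{E12}$$
$$2^{-n[H(B|X) + c\delta]} \, \Pi_{\rho_{x^n},\delta}^n \leq \Pi_{\rho_{x^n},\delta}^n \rho_{x^n} \Pi_{\rho_{x^n},\delta}^n \leq 2^{-n[H(B|X) - c\delta]} \, \Pi_{\rho_{x^n},\delta}^n , \tag{E13}$$

where $H(B|X) = \sum_x p_X(x) H(\rho_x)$ is the conditional quantum entropy. 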

Let $\rho$ be the expected density operator of the ensemble $\{p_X(x), \rho_x\}$ so that $\rho = \sum_x p_X(x) \rho_x$. 
The following properties are proved in Refs. [19, 56, 58]: 

$$\forall x^n \in T_\delta^{X^n} : \quad \mathrm{Tr}\{\rho_{x^n} \Pi_{\rho,\delta}^n\} \geq 1 - \varepsilon ,$$
$$\sum_{x^n} p_{X'^n}(x^n) \rho_{x^n} \leq [1 - \varepsilon]^{-1} \rho^{\otimes n} . \tag{E14}$$

In order to justify some of the estimates made in Section V B, we use the above estimates on 
eigenvalues and support sizes. For the classical communication cost, we consider 

$$H_R(W^n)_\gamma - H_{\min}(W^n R^n)_\gamma + H_0(R^n)_\gamma . \tag{E15}$$

The smallest nonzero eigenvalue of the reduced state on $W^n$ is larger than $2^{-n[H(W) + c\delta]}$ due to the 
typical projection on $W^n$. Thus, we have that 

$$H_R(W^n)_\gamma \leq n[H(W)_\gamma + c\delta] . \tag{E16}$$

The largest eigenvalue of $\gamma_{W^n R^n}$ is bounded by 

$$2^{-n[H(W)_\gamma - c\delta]} \, 2^{-n[H(R|W)_\gamma - c\delta]} \tag{E17}$$

due to the typical projection on $W^n$ and the conditionally typical projection on $R^n$. So we have 
that 

$$H_{\min}(W^n R^n)_\gamma \geq n[H(WR)_\gamma - 2c\delta] . \tag{E18}$$

The size of the support of $R^n$ is bounded from above by 

$$2^{n[H(R)_\gamma + 2c\delta]} \tag{E19}$$

due to the outermost projection on $R^n$. Thus, we have that 

$$H_0(R^n)_\gamma \leq n[H(R)_\gamma + 2c\delta] . \tag{E20}$$

The above development then gives the following bound: 

$$H_R(W^n)_\gamma - H_{\min}(W^n R^n)_\gamma + H_0(R^n)_\gamma \leq n[I(W; R)_\gamma + 5c\delta] . \tag{E21}$$

We have similar arguments for bounding the shared randomness cost: 

$$H_R(W^n)_\gamma - H_{\min}(W^n X^n R^n)_\gamma + H_0(R^n X^n)_\gamma . \tag{E22}$$

By the same argument as above, we have that 

$$H_R(W^n)_\gamma \leq n[H(W)_\gamma + c\delta] . \tag{E23}$$

The largest eigenvalue of $\gamma_{W^n X^n R^n}$ is bounded by 

$$2^{-n[H(W)_\gamma - c\delta]} \, 2^{-n[H(X|W)_\gamma - c\delta]} \, 2^{-n[H(R|W)_\gamma - c\delta]} \tag{E24}$$
$$= 2^{-n[H(WX)_\gamma - 2c\delta]} \, 2^{-n[H(R|WX)_\gamma - c\delta]} \tag{E25}$$
$$= 2^{-n[H(WXR)_\gamma - 3c\delta]} , \tag{E26}$$

where we have used the fact that $H(R|W)_\gamma = H(R|WX)_\gamma$ because the state on $R$ is independent 
of $X$. Thus, we have that 

$$H_{\min}(W^n X^n R^n)_\gamma \geq n[H(WXR)_\gamma - 3c\delta] . \tag{E27}$$

Finally, the support of $R^n X^n$ is bounded again by $2^{n[H(RX)_\gamma + 2c\delta]}$ due to the typical projections, 
so that we have 

$$H_0(R^n X^n)_\gamma \leq n[H(RX)_\gamma + 2c\delta] . \tag{E28}$$

The above development then gives the following bound: 

$$H_R(W^n)_\gamma - H_{\min}(W^n X^n R^n)_\gamma + H_0(R^n X^n)_\gamma \leq n[I(W; XR)_\gamma + 6c\delta] . \tag{E29}$$
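
The bookkeeping of the slack terms in the communication-cost estimate can be checked with a few lines of arithmetic. The sketch below assumes that the three individual bounds carry slack $c\delta$, $2c\delta$ and $2c\delta$ respectively, as we read them from (E16), (E18) and (E20), and confirms that they combine to $n[I(W; R) + 5c\delta]$. All numerical values and function names are hypothetical.

```python
def mutual_information(ent_w, ent_r, ent_wr):
    """I(W;R) = H(W) + H(R) - H(WR) for single-copy entropies."""
    return ent_w + ent_r - ent_wr


def communication_cost_bound(ent_w, ent_r, ent_wr, n, c, delta):
    """Combine the three one-sided estimates into a bound on
    H_R(W^n) - H_min(W^n R^n) + H_0(R^n)."""
    smallest_eig_term = n * (ent_w + c * delta)        # upper bound on H_R(W^n)
    largest_eig_term = n * (ent_wr - 2 * c * delta)    # lower bound on H_min(W^n R^n)
    support_term = n * (ent_r + 2 * c * delta)         # upper bound on H_0(R^n)
    return smallest_eig_term - largest_eig_term + support_term


# Hypothetical single-copy entropies and typicality parameters.
ent_w, ent_r, ent_wr = 1.0, 0.5, 1.25
n, c, delta = 100, 2.0, 0.01

bound = communication_cost_bound(ent_w, ent_r, ent_wr, n, c, delta)
target = n * (mutual_information(ent_w, ent_r, ent_wr) + 5 * c * delta)
print(bound, target)
```

The same tally with slack terms $c\delta$, $3c\delta$ and $2c\delta$ gives the $6c\delta$ appearing in the bound on the shared randomness cost.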

Appendix F: Uncertainty Relation 
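Lemma 30. For every $|\psi\rangle\langle\psi|_{ABR} \in \mathcal{V}(ABR)$ and observable (measurement) $Z_A$, we have that 

$$H_{\min}(A|B)_\psi + H_{\max}(Z_A|R)_\psi \leq \log |A| . \tag{F1}$$

Proof. Define $\lambda = H_{\min}(A|B)_\psi$ and let $\sigma_B \in \mathcal{S}(B)$ be such that 

$$\psi_{AB} \leq 2^{-\lambda} \, \mathbb{1}_A \otimes \sigma_B . \tag{F2}$$

The measurement procedure can be described by an isometry $U_{A \to ZA}$ whose action is specified by 
$U_{A \to ZA} |z\rangle_A = |z\rangle_Z |z\rangle_A$, where $\{|z\rangle\}$ are the basis states associated with the (projective) measurement. 
Applied to $\psi_{AB}$ this yields 

$$\zeta_{ZAB} = U_{A \to ZA} \, \psi_{AB} \, U_{A \to ZA}^{\dagger} \tag{F3}$$
$$\leq 2^{-\lambda} \, U_{A \to ZA} (\mathbb{1}_A \otimes \sigma_B) U_{A \to ZA}^{\dagger} \tag{F4}$$
$$= 2^{-\lambda} \sum_z |z\rangle\langle z|_Z \otimes |z\rangle\langle z|_A \otimes \sigma_B \tag{F5}$$
$$\leq 2^{-\lambda} \, \mathbb{1}_{ZA} \otimes \sigma_B \tag{F6}$$
$$= 2^{-(\lambda - \log |A|)} \, \mathbb{1}_Z \otimes \pi_A \otimes \sigma_B , \tag{F7}$$

where $\pi_A = \mathbb{1}_A / |A|$. Thus, $\mu = \lambda - \log |A|$ and $\pi_A \otimes \sigma_B$ are feasible for $H_{\min}(Z|AB)_\zeta$, meaning 

$$H_{\min}(Z|AB)_\zeta \geq \lambda - \log |A| . \tag{F8}$$

Therefore the claim follows, since $H_{\min}(Z|AB)_\zeta = -H_{\max}(Z|R)_\zeta = -H_{\max}(Z_A|R)_\psi$. □ 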